Memory leak

10 posts in this topic

Hey,

I've been trying to figure out why our ftrack event listener machine keeps running out of memory. It seems the event listener is holding on to something that never gets garbage collected.

Take a simple event listener like this:

 

import ftrack_api


def test(event):
    session = ftrack_api.Session()


session = ftrack_api.Session(auto_connect_event_hub=True)
session.event_hub.subscribe("topic=ftrack.update", test)
session.event_hub.wait()

It starts at 27 MB, but every ftrack event that triggers it adds a couple of MB. It never gets back down to 27 MB.

Anyone experiencing the same?

On 1/16/2020 at 11:15 PM, Steve Petterborg said:

Hi Toke,

 

Can you try updating to API version 1.8.2 if you have not yet? We added a fix for mem leaks when creating and destroying many Sessions.

Hey Steve,

I tried 1.8.2 and it does not fix our issue. Since other people are not experiencing the same thing, I'm suspecting it might be our environment.

I would greatly appreciate it if someone could test on their end with this (conda) environment:

 

name: ftrack-pipeline-environment
channels:
  - defaults
dependencies:
  - certifi=2019.6.16=py27_0
  - pip=19.1.1=py27_0
  - python=2.7.16=hcb6e200_0
  - setuptools=41.0.1=py27_0
  - sqlite=3.29.0=h0c8e037_0
  - vc=9=h7299396_1
  - vs2008_runtime=9.00.30729.1=hfaea7d5_1
  - wheel=0.33.4=py27_0
  - wincertstore=0.2=py27hf04cefb_0
  - pip:
    - arrow==0.14.2
    - backports-functools-lru-cache==1.5
    - cachetools==3.1.1
    - chardet==3.0.4
    - clique==1.5.0
    - ftrack-python-api==1.8.2
    - google-api-python-client==1.7.9
    - google-auth==1.6.3
    - google-auth-httplib2==0.0.3
    - google-auth-oauthlib==0.4.0
    - httplib2==0.13.0
    - idna==2.8
    - jsondiff==1.2.0
    - oauthlib==3.0.2
    - pyasn1==0.4.5
    - pyasn1-modules==0.2.5
    - pyparsing==2.4.0
    - python-dateutil==2.8.0
    - requests==2.22.0
    - requests-oauthlib==1.2.0
    - rsa==4.0
    - six==1.11.0
    - slacker==0.9.65
    - slacker-log-handler==1.7.1
    - termcolor==1.1.0
    - uritemplate==3.0.0
    - urllib3==1.25.3
    - websocket-client==0.56.0
prefix: C:\Users\admin\miniconda\envs\ftrack-pipeline-environment

 

5 hours ago, Steve Petterborg said:

Thanks for the env, Toke!

 

What's everyone using for monitoring anyway? I've been using things built around gc.get_objects() and memory-profiler lately.

Good point. I haven't been able to track down the memory leak, so if you have some pointers that would be great!

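# Jupyter cell 1 (Python 2.7; atexit._exithandlers does not exist in Python 3)
import atexit

import ftrack_api

# Jupyter cell 2: %%memit must be the first line of its own cell and
# needs `%load_ext memory_profiler` to have been run beforehand.
%%memit
for _ in xrange(100):
    session = ftrack_api.Session(auto_connect_event_hub=False)
    session.close()
    # Remove the atexit entry that still references the closed session.
    for index, entry in enumerate(atexit._exithandlers):
        if entry[0] == session.close:
            # Delete only when a matching entry was actually found.
            del atexit._exithandlers[index]
            break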

Personally I do a lot of testing/hacking in Jupyter, so I'm using a "magic" annotation here. It also has some side-effects, so I have my own branch of memory-profiler. Lorenzo likes running memory profiler on the command line so he can get a nice graph. https://pypi.org/project/memory-profiler/
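
The loop at the top of this post removes each session's entry from the atexit registry after closing it. The reasoning appears to be that a bound method registered with atexit keeps its object alive. Below is a minimal sketch of that mechanism, assuming Python 3 and a stand-in class: FakeSession and is_collected_after_close are hypothetical names, not part of ftrack_api, and atexit.unregister is used in place of the Python 2 private list.

```python
import atexit
import gc
import weakref


class FakeSession(object):
    """Hypothetical stand-in for a session that registers close() at exit."""

    def __init__(self):
        self.closed = False
        # The atexit registry now holds a bound method, and therefore `self`.
        atexit.register(self.close)

    def close(self):
        self.closed = True


def is_collected_after_close(unregister):
    """Create, close and drop a FakeSession; report whether it was freed."""
    session = FakeSession()
    ref = weakref.ref(session)
    session.close()
    if unregister:
        # Python 3 API; the Python 2 loop above edits atexit._exithandlers.
        atexit.unregister(session.close)
    del session
    gc.collect()
    return ref() is None


leaked = not is_collected_after_close(unregister=False)
freed = is_collected_after_close(unregister=True)
print("kept alive while registered:", leaked)
print("freed after unregistering:", freed)
```

If a real Session behaves like this stand-in, closing it would not be enough to release it while the exit handler is still registered, which would match memory growing with every Session created.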

 

This was my first attempt at comparing the counts of object types for your mem leak, followed by my second approach to the handler function. In neither case did I create new sessions in the handler.

import collections
import functools
import gc

import ftrack_api


def count_types():
    # Count live, gc-tracked objects by type.
    counts = collections.defaultdict(int)
    for obj in gc.get_objects():
        counts[type(obj)] += 1
    return counts


def dd_compare(one, two):
    # Print how the per-type counts changed between two snapshots.
    one_only = set(one.keys()).difference(two.keys())
    if one_only:
        print('Types we lost:\n    {}'.format(
            '\n    '.join(str(type_) for type_ in one_only)
        ))
    two_only = set(two.keys()).difference(one.keys())
    if two_only:
        print('Types we gained:\n    {}'.format(
            '\n    '.join(str(type_) for type_ in two_only)
        ))
    for key in set(one.keys()).intersection(two.keys()):
        if one[key] == two[key]:
            continue
        print('{}: {} -> {}'.format(key, one[key], two[key]))


def handle_event(session, event):
    # The previous snapshot becomes the baseline for this event.
    global after
    before = after
    after = count_types()
    dd_compare(before, after)


session = ftrack_api.Session(auto_connect_event_hub=True)
print(session.server_url)
handler = functools.partial(handle_event, session)
session.event_hub.subscribe('topic=*', handler)
# Baseline taken after subscribing, so the first event reports only changes.
after = count_types()
session.event_hub.wait()
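
# Second approach: report a type only when its count reaches a new peak.
# Needs `collections` and `gc` imported; subscribe it the same way as the
# first handler.
peaks = collections.defaultdict(int)


def handle_event(session, event):
    after = collections.defaultdict(int)
    for i in gc.get_objects():
        after[type(i)] += 1
    for type_, count in after.items():
        if count > peaks[type_]:
            print('{} {}'.format(type_, count))
            peaks[type_] = count
    print('')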

 


There seem to be some caching issues with this approach.

Take this example and replace the task id:

 

import functools

import ftrack_api


def callback(session, event):
    task = session.get("Task", "ab3234f0-46a2-11ea-95c7-92527973a98f")
    print(task["status"]["name"])
    print(task["status"]["id"])


session = ftrack_api.Session(auto_connect_event_hub=True)
handler = functools.partial(callback, session)
session.event_hub.subscribe("topic=ftrack.update", handler)
session.event_hub.wait()

 

The first time the callback runs, it prints the correct status, but on subsequent runs it keeps printing that same status. I tried various queries and fetching the status by id, but nothing seems to return the updated status.
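
One plausible explanation is that a long-lived session serves the task from its local cache instead of asking the server again. Below is a toy model of that behaviour, assuming nothing about ftrack internals: FakeServer and CachingSession are hypothetical stand-ins, not ftrack_api classes.

```python
class FakeServer(object):
    """Hypothetical stand-in for the remote server; holds the true status."""

    def __init__(self):
        self.status = "Not started"


class CachingSession(object):
    """Hypothetical session that caches every value it has fetched."""

    def __init__(self, server):
        self._server = server
        self._cache = {}

    def get_status(self, task_id):
        if task_id not in self._cache:
            # Only the first read for a given id goes to the "server".
            self._cache[task_id] = self._server.status
        return self._cache[task_id]

    def reset(self):
        """Forget cached values so the next read goes back to the server."""
        self._cache.clear()


server = FakeServer()
session = CachingSession(server)
first = session.get_status("task-1")
server.status = "In progress"  # the status changes on the server
stale = session.get_status("task-1")
session.reset()
fresh = session.get_status("task-1")
print(first, stale, fresh)
```

If the real session behaves similarly, clearing its local state at the start of the callback may be worth trying. ftrack_api.Session documents a reset() method for this, but whether it resolves the stale status on 1.8.2 is an assumption that should be checked against the API documentation.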
