ftrack_api threadsafe


Remus Avram

Hi Ftrack Team,

we would like to use the ftrack session in threads, but, unfortunately, it seems that ftrack_api is not thread safe.

When we use the same session in multiple threads, the attribute values of the entities come back as a Symbol (NOT SET).

Please find below a script where we were able to reproduce the issue:

from multiprocessing.dummy import Pool as ThreadPool

import ftrack_api
from ftrack_api.symbol import Symbol


session = ftrack_api.Session()


def check_keys(entity):
    for key in entity.keys():
        if isinstance(entity[key], Symbol):
            print('{0}: {1}'.format(entity, key))


def check_children(entity):
    if 'children' in entity.keys():
        for child in entity['children']:
            check_keys(entity=child)
            check_children(entity=child)


def main():
    projects = session.query("Project").all()
    pool = ThreadPool()
    pool.map(check_children, projects)


if __name__ == "__main__":
    main()

ftrack_api version: 1.3.2

ftrack server version: 3.5.6

 


Hi Mattias,

ahh... are you going to make it thread safe?

The problem is that, unless auto-populate is explicitly disabled when the session is created, it should never return a Symbol (NOT SET) value.

Creating a session per thread works as expected. But it doesn't help us as the sessions are not connected.


Thanks @Mattias Lagergren for your answer!

For us it's quite important because we are planning to use threads in all of our tools. We would like to use at least one thread so that the UI does not freeze while it is fetching the data.

2 minutes ago, Mattias Lagergren said:

One possible way to move forward with this is to change the example above and pass in the project id to the threads and then query for TypedContext where project_id is <project_id> in the thread.

You mean something like this:

from multiprocessing.dummy import Pool as ThreadPool

import ftrack_api
from ftrack_api.symbol import Symbol


session = ftrack_api.Session()


def check_keys(entity):
    for key in entity.keys():
        if isinstance(entity[key], Symbol):
            print('{0}: {1}'.format(entity, key))


def check_children(entity_id):
    entity = session.get('TypedContext', entity_id)
    if 'children' in entity.keys():
        for child in entity['children']:
            check_keys(entity=child)
            check_children(entity_id=child['id'])


def main():
    projects = session.query("Project").all()
    projects_id = [project["id"] for project in projects]
    pool = ThreadPool()
    pool.map(check_children, projects_id)


if __name__ == "__main__":
    main()

It still doesn't work. In the thread most of the time session.get('TypedContext', entity_id) returns None.


Thanks @Martin Pengelly-Phillips for the info!

So, as I understand it, the cache is built per session. If there is a session per thread, then each thread has its own cache file, which can contain the same data as the caches of the other sessions. Am I correct?

With only one session, there is a single cache file with all the data, so the queries are faster and there are fewer queries to the database.

Do you know if the sessions are connected?

I did a test and it seems that they are. I queried an AssetBuild in one session and created a task in another session, using that AssetBuild as the parent, and the task was created.

@Martin Pengelly-Phillips: I am interested in how you are using the session. Are you creating a new session for each query / commit?


> @Martin Pengelly-Phillips: I am interested in how you are using the session. Are you creating a new session for each query / commit?

No, we don't create a new session for each query or commit, as that would likely be redundant and slow. However, we often have a session (or two) in background threads in order to allow a main UI thread to continue unblocked. We also often run multiple sessions in background threads for event processing in order to avoid event deadlock.
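As a rough illustration of the background-thread pattern described above, here is a minimal sketch. The names SessionWorker, StubSession and StubQuery are hypothetical and not part of ftrack_api; the stub only exists so the example runs without a server. In real code the factory passed in would presumably be ftrack_api.Session. The worker creates its session inside its own thread and hands back plain ids rather than live entities, on the assumption that entities should stay with the session that loaded them.

```python
import threading

try:
    import queue  # Python 3
except ImportError:
    import Queue as queue  # Python 2


class SessionWorker(threading.Thread):
    """Run queries on a session that lives entirely in this thread."""

    def __init__(self, session_factory):
        super(SessionWorker, self).__init__()
        self.daemon = True
        # Zero-argument callable returning a new session,
        # e.g. ftrack_api.Session in real code (assumption).
        self._factory = session_factory
        self._requests = queue.Queue()
        self.results = queue.Queue()

    def submit(self, expression):
        """Ask the worker to run a query expression."""
        self._requests.put(expression)

    def stop(self):
        """Tell the worker to exit after pending requests."""
        self._requests.put(None)

    def run(self):
        # Create the session in the thread that will use it.
        session = self._factory()
        while True:
            expression = self._requests.get()
            if expression is None:
                break
            try:
                entities = session.query(expression).all()
                # Pass plain ids back, not entity objects.
                ids = [entity['id'] for entity in entities]
                self.results.put((expression, ids, None))
            except Exception as error:
                self.results.put((expression, None, error))


class StubQuery(object):
    """Hypothetical stand-in for a query result."""

    def __init__(self, rows):
        self._rows = rows

    def all(self):
        return self._rows


class StubSession(object):
    """Hypothetical stand-in so the sketch runs without a server."""

    def query(self, expression):
        return StubQuery([{'id': 'id-1'}, {'id': 'id-2'}])
```

A UI would start one worker, call submit('Project') from the main thread, and poll worker.results (for example from a timer) instead of blocking on the query.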


  • 1 month later...
2 minutes ago, Nebukadhezer said:

we also use one session across threads and I am constantly running into the same behavior.
As a workaround I am setting the auto_populate attribute to True at many places as this seems to be setting itself to False when the same session object is used from multiple threads...

We tried multiple ways. It will not work.

If you are using multiple threads, use multiple sessions.
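For anyone landing here later, a minimal sketch of one way to get a session per thread, using threading.local. PerThreadSessions, StubSession, work and run are hypothetical names for illustration; the stub is only there so the example runs without a server, and in real code the factory would presumably be ftrack_api.Session. Only plain values such as ids are passed into the threads.

```python
import threading
from multiprocessing.dummy import Pool as ThreadPool


class PerThreadSessions(object):
    """Hand out one session per thread, created lazily on first use."""

    def __init__(self, session_factory):
        # Zero-argument callable returning a new session,
        # e.g. ftrack_api.Session in real code (assumption).
        self._factory = session_factory
        self._local = threading.local()

    def get(self):
        session = getattr(self._local, 'session', None)
        if session is None:
            session = self._factory()
            self._local.session = session
        return session


class StubSession(object):
    """Hypothetical stand-in so the sketch runs without a server."""

    def __init__(self):
        # Remember which thread created this session.
        self.owner = threading.current_thread().name


sessions = PerThreadSessions(StubSession)


def work(project_id):
    # Each thread only ever touches its own session.
    session = sessions.get()
    # Real code would query here, using only the id passed in.
    return (project_id, session.owner, threading.current_thread().name)


def run(project_ids, workers=4):
    pool = ThreadPool(workers)
    try:
        return pool.map(work, project_ids)
    finally:
        pool.close()
        pool.join()
```

Each call to sessions.get() inside a worker returns the session created by that same thread, so no session object is shared across threads.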


  • 5 weeks later...
Quote

It makes sense for us, as using multiple sessions means creating multiple cache files, which increases the number of queries to the DB.

Note that you could also write your own cache interface to connect to a shared cache if you want to reduce DB hits. For example, we run a Redis cache per site and created a RedisCache implementation as a subclass of ftrack_api.cache.Cache.
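To make the shared-cache idea a bit more concrete, here is a very rough sketch of a thread-safe key/value store. It is not the RedisCache mentioned above and does not import ftrack_api. SharedDictCache is a hypothetical name, and the method names (get, set, remove, keys) are an assumption about what the ftrack_api.cache.Cache interface looks like, so check them against your installed version. A real shared cache would subclass ftrack_api.cache.Cache, talk to something like Redis, and store serialised values rather than live entity objects.

```python
import threading


class SharedDictCache(object):
    """Thread-safe key/value store shaped like a cache backend.

    Assumption: these method names mirror ftrack_api.cache.Cache.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def get(self, key):
        # Raises KeyError on a miss.
        with self._lock:
            return self._data[key]

    def set(self, key, value):
        # Values are expected to be serialised strings.
        with self._lock:
            self._data[key] = value

    def remove(self, key):
        # Raises KeyError if the key is not present.
        with self._lock:
            del self._data[key]

    def keys(self):
        # Return a snapshot so callers can iterate safely.
        with self._lock:
            return list(self._data)
```

The lock only protects the dictionary itself; it does nothing to make a single session safe to use from several threads.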

 


11 minutes ago, Martin Pengelly-Phillips said:

Note that you could also write your own cache interface to connect to a shared cache if you want to reduce DB hits. For example, we run a Redis cache per site and created a RedisCache implementation as a subclass of ftrack_api.cache.Cache.

 

Thanks for the info! We will try to test this, too.
