Some files are never closed #60

@bochecha

I'm working on a script that uses GitPython to loop over lots of Git repos, pull remote changes, etc.

At some point, I noticed that calling repo.is_dirty() opens some files that are never closed until the process exits, which in turn causes the tool to crash with "too many open files" after iterating over enough repos:

import os

import git

# Put here whatever Git repos you might have in the current folder
# Listing more than one makes the problem more visible
repos = ["ipset", "libhtp"]

raw_input("Check open files with `lsof -p %s | grep %s`" % (os.getpid(),
                                                            os.getcwd()))

for name in repos:
    repo = git.Repo(name)
    repo.is_dirty()

    del repo

raw_input("Check open files again")                 # files are still open
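
The same check can be done from inside the script instead of running `lsof` in another terminal. The helper below is a minimal sketch, not part of GitPython: `open_paths_under` is a hypothetical name, and it assumes a Linux system where `/proc/self/fd` is available. It only reports open file descriptors, so files that are memory-mapped after their descriptor was closed would not show up.

```python
import os


def open_paths_under(root):
    """Return the sorted paths of files this process holds open under `root`.

    Linux-only sketch: reads the symlinks in /proc/self/fd (an assumption,
    this directory does not exist on every platform).
    """
    root = os.path.realpath(root)
    fd_dir = "/proc/self/fd"
    paths = []
    for fd in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, fd))
        except OSError:
            # The descriptor used to list the directory is already gone.
            continue
        if target == root or target.startswith(root + os.sep):
            paths.append(target)
    return sorted(paths)
```

Calling `open_paths_under(os.getcwd())` before the loop and again after it should list the files that stayed open, without relying on the prompts.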

I tried digging deeper down the GitPython stack, but couldn't find the actual cause.

In case it's helpful, below is the same as the previous snippet, but directly using the lowest-level gitdb objects I could find that open the files:

import os

import git
from gitdb.util import hex_to_bin

# Put here whatever Git repos you might have in the current folder
# Listing more than one makes the problem more visible
repos = ["ipset", "libhtp"]

raw_input("Check open files with `lsof -p %s | grep %s`" % (os.getpid(),
                                                            os.getcwd()))

for name in repos:
    repo = git.Repo(name)
    sha = hex_to_bin("71acab3ca115b9ec200e440188181f6878e26f08")

    for db in repo.odb._dbs:
        try:
            for item in db._entities:               # db._entities opens *.pack files
                item[2](sha)                        # opens *.idx files

        except AttributeError:
            # Not all DBs have this attribute
            pass

    del repo

raw_input("Check open files again")                 # files are still open

But that's just where the files are opened (in fact they are opened even lower down, but I didn't manage to dig deeper), not where they are supposed to be closed.
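
Since the files are released when the process exits, one possible stopgap is to do the per-repo work in a short-lived child process. This is only a sketch of that idea and not a fix for the leak itself: `run_isolated`, `is_dirty_isolated` and `IS_DIRTY_SCRIPT` are hypothetical names, and the child script assumes GitPython is importable by the same interpreter.

```python
import subprocess
import sys

# Hypothetical child script mirroring the loop body from the report.
# It prints 1 if the repo is dirty, 0 otherwise.
IS_DIRTY_SCRIPT = """
import sys
import git
print(int(git.Repo(sys.argv[1]).is_dirty()))
"""


def run_isolated(script, *args):
    """Run `script` in a fresh Python interpreter and return its stdout, stripped."""
    out = subprocess.check_output([sys.executable, "-c", script] + list(args))
    return out.decode().strip()


def is_dirty_isolated(path):
    """Check is_dirty() in a child process, so its open files die with it."""
    return run_isolated(IS_DIRTY_SCRIPT, path) == "1"
```

In the loop above, `is_dirty_isolated(name)` would replace the `git.Repo(name)` and `repo.is_dirty()` pair. The cost is one interpreter start per repo, which may be too slow for a large number of repos.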
