This issue is mostly the 'meeting notes' of a long conversation between @mahaloz and me about binsync's current plugin management issues.
The Problem
There are multiple common issues with the current system as of commit f6eaf70:
- Plugin / core version desyncs. Right now the plugins are not really versioned, but we require their installed code to come from the same commit that the core package came from.
- Installation into the wrong environment. It is common to install binsync into the wrong Python environment, as each plugin needs itself and a copy of binsync to be installed into the environment of the interpreter used by the desired decompiler.
- Inflexibility in adding decompiler support; plugins must be first party.
- Inability to easily update the binsync of users running angr.
- People running old versions of binsync core despite upgraded plugins, and vice versa.
- No ability to add a true plugin system to binsync due to the inability to reliably install python packages
- Difficulties with editable installs on Windows due to symlink issues: @mahaloz
Multiple of the above issues compound. For example, multiple installations of binsync might be on different versions simultaneously without the user knowing. Another simpler example: updating may be avoided due to installation tribulations.
Why this is hard
- Each decompiler uses its own interpreter, each of which requires binsync support
- A decompiler's interpreter may be bundled or may be 'the first found system interpreter'
- Some interpreters may not come with pip or any package managing tool
- Some decompilers (like single-file angr-management) do not have a site-packages directory to edit
- Some decompilers (like single-file angr-management) require binsync code be copied and included directly in the main project (see https://github.com/angr/angr-management/tree/6751e9831e5758e74f8079f019731ca13e7d4741/angrmanagement/plugins/angr_binsync ); thus updates must be done on the decompiler's end
- Installing and updating should be out-of-the-box trivially easy for inexperienced users; i.e. should be almost entirely automated and somewhat idiot proof (for end users).
- Developers should easily be able to edit / tinker on any given install
- Some decompilers do not use a fixed python interpreter; the version can change either because the system version changes, the user changes the version, etc.
- Users may accidentally clobber items within their environment
- Should work across OSes
- We intentionally store a global config that all binsync's share, and desire to continue to do so
- Uninstallation should be relatively easy
- Various decompilers require some sort of entry point for binsync to be installed; i.e. a user must tell IDA to execute a specific function as a plugin. This needs to be automated as well.
- Some decompilers need additional non-python code (like Ghidra, which currently needs https://github.com/mahaloz/binsync-ghidra-plugin/releases/tag/v1.1.0 )
Proposal Desires
- We want to use pip for packages; even if something wraps it, pip underneath
- Each plugin should become a stand-alone pip-installable package on pypi.org; these packages would depend on binsync>=a.b.c as needed.
- On plugin install, binsync (or rather the plugin, likely using a function defined in binsync) should install hook files into each decompiler.*
- binsync's API / library functions should be usable and ideally installable into other python environments without worrying about concurrent installs clobbering each other's global state.
- No longer require a version sync between binsync and plugins; we want a simple dependency.
- We are leaning towards namespaced packages for the various binsync packages to-be; but this is an implementation detail we can address later.
- A unified interface plugins can use to check for newer versions and warn the user, an interface binsync could use for itself (likely using the https://pypi.org/project/outdated/ package)
* This might be hard, since installing a hook may require manually querying the user, and we do not want people to have to run pip install binsync.plugins.ida && binsync install-hook ida separately if possible. Likewise, pip should not have to query the user. Avoiding both suggests letting binsync manage the plugin installation.
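As a rough illustration of the unified version-check interface mentioned in the list above, here is a minimal sketch. It assumes plain a.b.c version strings and takes the "look up the latest release" step as a callback, so it could be backed by the outdated package or anything else. All names here (parse_version, newer_version_available, warn_if_outdated, fetch_latest) are hypothetical and not part of binsync.

```python
from typing import Callable, Optional


def parse_version(text: str) -> tuple:
    """Turn a plain 'a.b.c' string into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))


def newer_version_available(
    installed: str,
    fetch_latest: Callable[[], Optional[str]],
) -> Optional[str]:
    """Return the latest version string if it is newer than `installed`.

    `fetch_latest` is a hypothetical callback that returns the latest
    published version, or None when the lookup fails (e.g. no network).
    """
    latest = fetch_latest()
    if latest is None:
        return None
    if parse_version(latest) > parse_version(installed):
        return latest
    return None


def warn_if_outdated(name: str, installed: str, fetch_latest) -> Optional[str]:
    """Build a warning message for the user, or None if up to date."""
    latest = newer_version_available(installed, fetch_latest)
    if latest is None:
        return None
    return f"{name} {installed} is outdated; {latest} is available."
```

Both binsync itself and each plugin could call the same function with their own name and version, which is the point of having a single interface.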
The desired hook would look something like:
from pathlib import Path
import sys

# Adjust sys.path as needed
# sys.path.insert(0, "abc")

entrypoint = None
try:
    from binsync.plugins.ida import entrypoint
except ModuleNotFoundError:
    pass  # The binsync IDA plugin is not installed; do nothing
# finally:
#     Restore sys.path if desired

# Only start binsync if the import above succeeded
if entrypoint is not None:
    entrypoint()
Each plugin would then have to define .entrypoint() which could be as simple as binsync.core.main(name, __version__) or something.
Proposals
The following proposals are listed in increasing order of complexity with later proposals building off the former proposals.
Per-interpreter user managed default environments
For each interpreter, the user would be expected to manually pip install binsync, the plugin for the given interpreter, and the plugin hook.
We do not want users manually installing a plugin hook themselves so we would expose a CLI like: binsync plugin hook ida
Downsides
- May not work for all decompilers; some vendor Python interpreters that ship without pip. We could potentially vendor our own pip or just use https://bootstrap.pypa.io/get-pip.py to bypass this issue.
- May not work for all decompilers; (at least) some versions of angr-management have no site-packages directory to install our binsync libraries into.
- We do not want to require the user to manage this manually for each interpreter.
- Ideally we would also prefer not to require an additional step after the install, or along with an uninstall or reinstall.
- This uses the default environment of the interpreter; things can clobber each other and users can uninstall dependencies accidentally. Things can likewise break if the decompiler's interpreter changes.
- We have distinct copies of binsync per decompiler, potentially with version desyncs
- Because this is per-interpreter, updates are significantly more effort, as they must also be done per interpreter
Per-interpreter binsync managed default environments
For each interpreter, the user would pip install binsync then binsync install plugins.ida (or whichever decompiler is desired); this would wrap a pip install and installing the hook. binsync install would support -U for upgrade and -e for editable installs, as it would just pass these arguments along to pip.
A CLI possibility:
$ binsync --help
  --version
  --config                  # Print global config and path
  list                      # Lists all installed packages (core, plugin.ida, etc)
  install                   # `pip install` or update a binsync package; install hooks for programs as needed
    --help
    -e, --editable          # pip install -e wrapper
    -U, --update            # pip install -U wrapper
    arg [arg2...]           # pip install wrapper
  reinstall [arg...]        # Reinstall a binsync package and hooks
  uninstall [arg...]        # Uninstall a binsync package
The install command would be interactive and might prompt a user for information; it would also save such information to the global config for use by reinstall and uninstall; for example, where a hook is installed.
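A rough sketch of how install could remember where a hook went, so that reinstall and uninstall can find it later. This assumes a JSON config file, which is only an assumption, since the format of the global config is not specified here. The helper names record_hook and lookup_hook are hypothetical.

```python
import json
from pathlib import Path


def record_hook(config_path: Path, plugin: str, hook_path: Path) -> None:
    """Remember where a plugin's hook file was installed."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("hooks", {})[plugin] = str(hook_path)
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2))


def lookup_hook(config_path: Path, plugin: str):
    """Return the recorded hook path for a plugin, or None if unknown."""
    if not config_path.exists():
        return None
    hooks = json.loads(config_path.read_text()).get("hooks", {})
    path = hooks.get(plugin)
    return Path(path) if path is not None else None
```

With something like this, uninstall would call lookup_hook, delete the file it points to, and never need to prompt the user a second time.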
In this case, plugins would:
- Be pip installable packages that should never be manually installed by the user.
  - To avoid confusion, user installations should fail: perhaps they could fail if PIP_INSTALL_BY_BINSYNC is not in the environment to achieve this?
- Depend on binsync as a package dependency
- Contain a custom hook.py to be installed by binsync, or if not, binsync can use a default version
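To make the pip wrapping and the PIP_INSTALL_BY_BINSYNC idea from the list above concrete, here is a minimal sketch. It only builds the command line and environment; the function names (build_pip_install, installed_by_binsync) are hypothetical, and how a plugin would actually perform the check at install time is left open, since installing from a wheel does not run any plugin code.

```python
import os
import sys

GUARD_VAR = "PIP_INSTALL_BY_BINSYNC"  # name taken from the proposal above


def build_pip_install(package, interpreter=sys.executable, upgrade=False, editable=False):
    """Build the argv and environment for a binsync-driven pip install."""
    # Running pip as a module targets the environment of `interpreter`.
    argv = [interpreter, "-m", "pip", "install"]
    if upgrade:
        argv.append("-U")
    if editable:
        argv.append("-e")
    argv.append(package)
    # Copy the current environment and mark the install as binsync-driven.
    env = dict(os.environ)
    env[GUARD_VAR] = "1"
    return argv, env


def installed_by_binsync(environ=None):
    """Check whether the guard variable is present in the environment."""
    environ = os.environ if environ is None else environ
    return GUARD_VAR in environ
```

The returned pair could be passed to subprocess.run(argv, env=env, check=True). Passing the decompiler's interpreter path as interpreter is what makes the install land in that interpreter's environment.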
Downsides
- May not work for all decompilers; some vendor Python interpreters that ship without pip. We could potentially vendor our own pip or just use https://bootstrap.pypa.io/get-pip.py to bypass this issue.
- May not work for all decompilers; (at least) some versions of angr-management have no site-packages directory to install our binsync libraries into.
- This uses the default environment of the interpreter; things can clobber each other and users can uninstall dependencies accidentally. Things can likewise break if the decompiler's interpreter changes.
- We have distinct copies of binsync per decompiler, potentially with version desyncs
- Because this is per-interpreter, updates are significantly more effort, as they must also be done per interpreter
Single binsync CLI concurrently managing per-interpreter binsync-managed default environments
Building atop the binsync package manager concept, we additionally break binsync out into:
- binsync: A package containing the CLI / plugin-manager
- binsync.api: The binsync API / data interface (lets programs read binsync binary files, etc)
- binsync.core: The core logic of binsync that plugins utilize; depends on binsync.api for writing out data files and such.
Using binsync to install / upgrade / uninstall things now operates on every environment at once, keeping them in sync.
Version output may look like:
$ binsync --version
CLI: 1.0.1
Core: 3.0.1
Plugins:
  ida: 2.0.4
  angr: 5.2.9
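A report like the one above could be assembled roughly as follows. This is only a sketch: it assumes the CLI runs on Python 3.8 or newer (for importlib.metadata), and the function names installed_version and format_versions are hypothetical.

```python
from importlib import metadata  # standard library on Python 3.8+


def installed_version(dist_name, lookup=metadata.version):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return lookup(dist_name)
    except metadata.PackageNotFoundError:
        return None


def format_versions(cli, core, plugins):
    """Render a version report; `plugins` maps plugin name to version."""
    lines = [f"CLI: {cli}", f"Core: {core}", "Plugins:"]
    for name, version in plugins.items():
        # A missing plugin is reported rather than silently skipped.
        lines.append(f"  {name}: {version or 'not installed'}")
    return "\n".join(lines)
```

Note that installed_version only sees the environment the CLI itself runs in; querying each decompiler's environment would need a separate mechanism.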
Downsides
- May not work for all decompilers; some vendor Python interpreters that ship without pip. We could potentially vendor our own pip or just use https://bootstrap.pypa.io/get-pip.py to bypass this issue.
- May not work for all decompilers; (at least) some versions of angr-management have no site-packages directory to install our binsync libraries into.
- This uses the default environment of the interpreter; things can clobber each other and users can uninstall dependencies accidentally. Things can likewise break if the decompiler's interpreter changes.
Single binsync CLI concurrently managing per-interpreter binsync-managed virtualenvs
Building upon the binsync package manager concept with a broken out binsync.api and binsync.core:
The binsync CLI would create and install packages into per-interpreter virtualenvs such as ~/.binsync/ida/venv/ (for IDA).
Decompiler hooks would import this code; it could be done by appending or prepending the virtualenv's path to their sys.path (temporarily or permanently), or perhaps with importlib; the method of importing is an implementation detail.
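One way the "temporary sys.path" variant could look is sketched below. This is not a settled design; prepended_sys_path and load_plugin are hypothetical names, and the directory passed in would be the site-packages directory inside the relevant virtualenv.

```python
import contextlib
import importlib
import sys


@contextlib.contextmanager
def prepended_sys_path(directory):
    """Temporarily put `directory` at the front of sys.path."""
    entry = str(directory)
    sys.path.insert(0, entry)
    try:
        yield
    finally:
        # Remove the first matching entry, which is normally the one we added.
        with contextlib.suppress(ValueError):
            sys.path.remove(entry)


def load_plugin(site_packages, module_name):
    """Import `module_name` from the given directory; None if it is missing."""
    with prepended_sys_path(site_packages):
        # Make sure freshly created directories are noticed by the importer.
        importlib.invalidate_caches()
        try:
            return importlib.import_module(module_name)
        except ModuleNotFoundError:
            return None
```

A module imported this way stays loaded after the path is restored, but any import the plugin performs lazily later on would no longer find the virtualenv. That trade-off is the main reason a hook might prefer the permanent variant.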
Benefits:
- These hooks would persist and remain functional even if a decompiler changed python interpreters.
- Decompilers that have no site-packages directory to install binsync plugins to would be supported, as the code is installed in the virtualenv
Downsides
- May not work for all decompilers; some vendor Python interpreters that ship without pip. We could potentially vendor our own pip or just use https://bootstrap.pypa.io/get-pip.py to bypass this issue.
- All binsync packages would be installed here, thus there would be no need to keep multiple concurrent copies synchronized.
- Potential version clobbering issues: If we require toml as a dependency of a plugin, and toml already exists in the interpreter's path and has been loaded, ours in the virtualenv may be ignored, which might cause issues if we need specific features from specific versions.*
* This may not be an issue in practice; the current binsync and all of its plugins require Python 3.6 and have a small list of dependencies. Hitting this would also require a distinct module loaded into the decompiler that runs in the same interpreter, loads before binsync, and requires a version of one of our dependencies that does not support 3.6. @mahaloz doubted we would have to worry about it but it is still a possible issue worth mentioning. We could also just vendor dependencies.
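If we ever wanted to detect the clobbering case rather than just hope it does not happen, a hook could check whether a dependency was already imported from somewhere other than our virtualenv. A minimal sketch, with the hypothetical name shadowed_dependencies:

```python
import sys
from pathlib import Path


def shadowed_dependencies(names, venv_site_packages, modules=None):
    """List dependencies already imported from outside the virtualenv."""
    modules = sys.modules if modules is None else modules
    root = Path(venv_site_packages).resolve()
    shadowed = []
    for name in names:
        module = modules.get(name)
        origin = getattr(module, "__file__", None)
        if origin is None:
            continue  # not loaded yet, or a module without a file on disk
        # A module is "ours" only if its file lives under the venv directory.
        if root not in Path(origin).resolve().parents:
            shadowed.append(name)
    return shadowed
```

A hook could call this before importing the plugin and warn the user when the returned list is not empty.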
Single binsync CLI managing single binsync-managed virtualenv
Building upon the binsync virtualenv manager idea:
The binsync cli would create and install packages into a single virtualenv: ~/.binsync/venv/, which would again be hooked by the decompiler hooks.
Benefits over multiple virtualenvs:
- We no longer need to keep multiple version-synced copies of binsync.
- Since we aren't necessarily using the exact interpreter of the decompilers, we can utilize pip.
  Technically, pip might install a plugin version that does not support the decompiler's Python version (i.e. theoretically pip could grab a release matching the interpreter running pip rather than the decompiler's).
  This is really not an issue though, since the plugin to be loaded would be coded specifically to work with the given decompiler, so we simply choose not to require python3.8 if we know IDA might use 3.6.
  We could also add runtime checks to prevent this, if desired.
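Such a runtime check could be as small as the following sketch, run at the top of a plugin's entrypoint. The name check_python and the error message are hypothetical.

```python
import sys


def check_python(minimum, current=None):
    """Raise if the running interpreter is older than `minimum`."""
    # Default to the interpreter actually running this code (the decompiler's).
    current = tuple(sys.version_info[:2]) if current is None else tuple(current)
    if current < tuple(minimum):
        raise RuntimeError(
            "This binsync plugin needs Python {}.{} or newer, but the "
            "decompiler is running {}.{}".format(*minimum, *current)
        )
```

For example, a plugin that supports Python 3.6 and up would call check_python((3, 6)) before doing anything else.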
Downsides
- Potential version clobbering issues: If we require toml as a dependency of a plugin, and toml already exists in the interpreter's path and has been loaded, ours in the virtualenv may be ignored, which might cause issues if we need specific features from specific versions.*
* This may not be an issue in practice; the current binsync and all of its plugins require Python 3.6 and have a small list of dependencies. Hitting this would also require a distinct module loaded into the decompiler that runs in the same interpreter, loads before binsync, and uses a version of one of our dependencies that does not support 3.6. @mahaloz doubted we would have to worry about it but it is still a possible issue worth mentioning. We could also just vendor dependencies.
Overall
I think we should have binsync manage binsync packages; in my opinion, the biggest question is whether we want:
- A CLI which manages the binsync installs across each interpreter environment
- A CLI that manages a singular binsync virtualenv that decompilers load code from
In my opinion, the biggest downside with a virtualenv is possible dependency version collisions due to our misuse of virtualenvs; though given the requirements to hit this issue, and after my conversation with @mahaloz, it seems rather unlikely. In that case, the benefit is that we no longer keep multiple version-synced copies of this package across multiple environments that a user could easily mess up / clobber.