Refactor installation and plugin management system

This issue is mostly the 'meeting notes' of a long conversation between @mahaloz and me about binsync's current plugin management issues.

# The Problem

There are multiple common issues with the current system as of: https://github.com/angr/binsync/commit/f6eaf70a2504fb3b46e2b51f3a76261cc4737ceb
1. Plugin / core version desyncs. Right now the plugins are not really versioned, but we require their installed code to come from the same commit that the core package came from.
2. Installation into the wrong environment: it is a common problem to install binsync into the wrong Python environment, as each plugin needs itself and a copy of binsync installed into the environment of the desired decompiler's interpreter.
3. Inflexibility in adding decompiler support; plugins must be first party.
4. Inability to easily update the binsync of users running angr.
5. People running old versions of binsync core despite upgraded plugins, and vice versa.
6. No ability to add a true plugin system to binsync due to the inability to reliably install Python packages.
7. Difficulties with editable installs on Windows due to symlink issues: @mahaloz

Several of the above issues compound. For example, multiple installations of binsync might be on different versions simultaneously without the user knowing. A simpler example: users may avoid updating because installation is painful.

## Why this is hard

1. Each decompiler uses its own interpreter, each of which requires binsync support
2. A decompiler's interpreter may be bundled or may be 'the first found system interpreter'
3. Some interpreters may not come with `pip` or any package managing tool
4. Some decompilers (like single file `angr-management`) do not have a `site-packages` directory to edit
5. Some decompilers (like single file `angr-management`) require binsync code to be copied and included directly in the main project: https://github.com/angr/angr-management/tree/6751e9831e5758e74f8079f019731ca13e7d4741/angrmanagement/plugins/angr_binsync Thus, updates must be done on the decompiler's end.
6. Installing and updating should be out-of-the-box trivially easy for inexperienced users; i.e. it should be almost entirely automated and somewhat foolproof (for end users).
7. Developers should easily be able to edit / tinker on any given install
8. Some decompilers do not use a fixed Python interpreter; the version can change because the system version changes, the user changes the version, etc.
9. Users may accidentally clobber items within their environment
10. Should work across OSes
11. We intentionally store a global config that all binsync installs share, and desire to continue to do so
12. Uninstallation should be relatively easy
13. Various decompilers require some sort of entry point for binsync to be installed; i.e. a user must tell IDA to execute a specific function as a plugin; this needs to be automated as well.
14. Some decompilers need additional non-Python code (like Ghidra, which currently needs https://github.com/mahaloz/binsync-ghidra-plugin/releases/tag/v1.1.0 )

# Proposal Desires

1. We want to use pip for packages; even if something wraps it, pip underneath
2. Each plugin should become a stand-alone pip-installable package on `pypi.org`; these packages would depend on `binsync>=a.b.c` as needed.
3. On plugin install, `binsync` or rather the plugins, likely using a function defined in `binsync`, should install hook files into each decompiler.**\***
4. `binsync`'s API / library functions should be usable and ideally installable into other python environments without worrying about concurrent installs clobbering each other's global state.
5. No longer require a version sync between binsync and plugins; we want a simple dependency.
6. We are leaning towards namespaced packages for the various binsync packages to-be; but this is an implementation detail we can address later.
7. A unified interface plugins can use to check for newer versions and warn the user; binsync could use the same interface for itself (likely using the https://pypi.org/project/outdated/ package)

**\*** This might be hard, since it may require manually querying the user, and we do not want people to have to run `pip install binsync.plugins.ida && binsync install-hook ida` separately if possible. Since `pip` itself should not have to query the user, we might want `binsync` to manage the plugin installation.
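To make the version-check desire (item 7) a bit more concrete, here is a minimal sketch of the comparison such an interface would perform. The names `parse_version` and `update_warning` are hypothetical, and the sketch assumes plain numeric `a.b.c` versions; how the latest version is fetched (for example via the `outdated` package) is deliberately left out and passed in as a parameter.

```python
def parse_version(text):
    """Turn a plain numeric 'a.b.c' string into a tuple of ints.

    Hypothetical helper: pre-release or local version suffixes are not handled.
    """
    return tuple(int(part) for part in text.strip().split("."))


def update_warning(package, installed, latest):
    """Return a warning string if `latest` is newer than `installed`, else None."""
    if parse_version(latest) > parse_version(installed):
        return f"{package} {installed} is outdated; {latest} is available"
    return None


if __name__ == "__main__":
    print(update_warning("binsync.plugins.ida", "2.0.4", "2.1.0"))
```

Comparing tuples of ints rather than raw strings matters here: as strings, `"3.10.0"` would sort before `"3.9.9"`.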

The desired hook would look something like:
```python
from pathlib import Path
import sys
# Adjust sys.path as needed
# sys.path.insert(0, "abc")

entrypoint = None
try:
    # Only the import is guarded, so errors raised by the plugin itself are not swallowed
    from binsync.plugins.ida import entrypoint
except ModuleNotFoundError:
    # The plugin is not installed for this interpreter; do nothing
    pass
# finally:
# Restore sys.path if desired

if entrypoint is not None:
    entrypoint()
``` 

Each plugin would then have to define `.entrypoint()` which could be as simple as `binsync.core.main(name, __version__)` or something.
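As a rough illustration of what "installing a hook" could mean, the sketch below writes a hook file like the one above into a decompiler's plugin directory. Everything here is an assumption for discussion: `install_hook`, the `binsync_hook.py` filename, and the idea that a decompiler picks up any `.py` file dropped into a plugin directory are hypothetical, and locating that directory per decompiler is not covered.

```python
from pathlib import Path

# Hypothetical default hook body; {module} is the plugin package to import.
HOOK_TEMPLATE = '''\
entrypoint = None
try:
    from {module} import entrypoint
except ModuleNotFoundError:
    pass

if entrypoint is not None:
    entrypoint()
'''


def install_hook(plugin_dir, module, filename="binsync_hook.py"):
    """Write a hook for `module` into `plugin_dir` and return the hook's path."""
    plugin_dir = Path(plugin_dir)
    plugin_dir.mkdir(parents=True, exist_ok=True)
    hook_path = plugin_dir / filename
    hook_path.write_text(HOOK_TEMPLATE.format(module=module))
    return hook_path
```

Returning the path lets the caller remember where the hook went, which an uninstall step would need later.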

# Proposals

The following proposals are listed in increasing order of complexity with later proposals building off the former proposals.

## Per-interpreter user managed default environments

For each interpreter, the user would be expected to manually `pip install binsync`, the plugin for the given interpreter, and the `plugin` hook.
We do not want users manually installing a plugin hook themselves so we would expose a CLI like: `binsync plugin hook ida`

#### Downsides

1. **May not work for all decompilers**: some vendored Python interpreters ship without `pip`; we can potentially vendor our own pip or just use https://bootstrap.pypa.io/get-pip.py to bypass this issue.
2. **May not work for all decompilers**: (at least) some versions of angr-management have no `site-packages` directory to install our binsync libraries into.
3. We do not want to require the user to manage this manually for each interpreter.
4. Ideally we would also prefer not to require an additional step after the install, or alongside an uninstall or reinstall.
5. This uses the default environment of the interpreter; things can clobber each other and users can uninstall dependencies accidentally; likewise, the install can break if the decompiler's interpreter changes.
6. We have distinct copies of binsync per-decompiler, potentially with version desyncs
7. Because this is per-interpreter, updates are significantly more effort, as they must also be done per interpreter

## Per-interpreter binsync managed default environments

For each interpreter, the user would `pip install binsync` then `binsync install plugins.ida` (or whichever decompiler is desired); this would wrap a `pip install` and installing the hook. `binsync install` would support `-U` for upgrade and `-e` for editable installs, as it would just pass these arguments along to `pip`.

A CLI possibility:
```bash
$ binsync --help
  --version
  --config   # Print global config and path

  list       # Lists all installed packages (core, plugins.ida, etc)
  install    # `pip install` or update a binsync package; install hooks for programs as needed
    --help
    -e, --editable  # pip install -e wrapper
    -U, --update    # pip install -U wrapper
    arg [arg2...]   # pip install wrapper
  reinstall [arg...]  # Reinstall a binsync package and hooks
  uninstall [arg...]  # Uninstall a binsync package
```

The `install` command would be interactive and might prompt a user for information; it would also save such information to the global config for use by `reinstall` and `uninstall`; for example, where a hook is installed.
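A minimal sketch of how `install` could persist hook locations for later use by `reinstall` and `uninstall`. The JSON format, the `hooks` key, and the function names are assumptions made for illustration only; the real global config may look quite different.

```python
import json
from pathlib import Path


def load_config(config_path):
    """Read the (hypothetical, JSON-formatted) global config; empty if missing."""
    config_path = Path(config_path)
    if config_path.exists():
        return json.loads(config_path.read_text())
    return {}


def record_hook(config_path, plugin, hook_path):
    """Remember where the hook for `plugin` was installed."""
    config = load_config(config_path)
    config.setdefault("hooks", {})[plugin] = str(hook_path)
    config_path = Path(config_path)
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2))
    return config


def recorded_hook(config_path, plugin):
    """Return the recorded hook path for `plugin`, or None if unknown."""
    return load_config(config_path).get("hooks", {}).get(plugin)
```

With this in place, `uninstall` would not need to prompt the user again: it could look up the recorded path and remove the file.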

In this case, plugins would:
1. Be pip-installable packages that should never be manually installed by the user.
2. Fail when installed directly by a user, to avoid confusion; perhaps they could fail if `PIP_INSTALL_BY_BINSYNC` is not in the environment.
3. Depend on binsync as a package dependency.
4. Contain a custom `hook.py` to be installed by binsync; if not, binsync can use a default version.
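The `PIP_INSTALL_BY_BINSYNC` idea could be sketched roughly as follows: `binsync install` builds a `pip` command and an environment containing the marker, and a plugin checks for the marker. The function names are hypothetical. One caveat worth keeping in mind: wheels do not run project code at install time, so a check like `ensure_installed_by_binsync` would only fire during source builds, or at import time if a plugin chose to call it there.

```python
import os
import sys

GUARD_VAR = "PIP_INSTALL_BY_BINSYNC"


def build_pip_command(package, upgrade=False, editable=False, python=sys.executable):
    """Build the argv that `binsync install` would hand to pip."""
    cmd = [python, "-m", "pip", "install"]
    if upgrade:
        cmd.append("-U")
    if editable:
        cmd.append("-e")
    cmd.append(package)
    return cmd


def build_pip_env(base_env=None):
    """Copy the environment and add the marker that plugins look for."""
    env = dict(os.environ if base_env is None else base_env)
    env[GUARD_VAR] = "1"
    return env


def ensure_installed_by_binsync(env=None):
    """Raise if the install was not started through the binsync CLI."""
    env = os.environ if env is None else env
    if GUARD_VAR not in env:
        raise RuntimeError("Install this plugin with `binsync install`, not pip directly")
```

The command and environment would then be passed to something like `subprocess.run(cmd, env=env, check=True)`; running pip through `python -m pip` keeps the install tied to the chosen interpreter.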

#### Downsides

1. **May not work for all decompilers**: some vendored Python interpreters ship without `pip`; we can potentially vendor our own pip or just use https://bootstrap.pypa.io/get-pip.py to bypass this issue.
2. **May not work for all decompilers**: (at least) some versions of angr-management have no `site-packages` directory to install our binsync libraries into.
3. This uses the default environment of the interpreter; things can clobber each other and users can uninstall dependencies accidentally; likewise, the install can break if the decompiler's interpreter changes.
4. We have distinct copies of binsync per-decompiler, potentially with version desyncs
5. Because this is per-interpreter, updates are significantly more effort, as they must also be done per interpreter

## Single binsync CLI concurrently managing per-interpreter binsync-managed default environments

Building atop the binsync package manager concept, we additionally break binsync out into:
1. `binsync`: A package containing the CLI / plugin-manager
2. `binsync.api`: The binsync API / data interface (lets programs read binsync binary files, etc)
3. `binsync.core`: The core logic of binsync that plugins utilize; depends on `binsync.api` for writing out data files and such.

Using `binsync` to install / upgrade / uninstall things now updates every environment in sync.
Version output may look like:
```bash
$ binsync --version
CLI: 1.0.1
Core: 3.0.1
Plugins:
  ida: 2.0.4
  angr: 5.2.9
```
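Since this proposal still keeps one copy of the packages per interpreter, the CLI would need some way to notice when those copies drift apart. A minimal sketch, assuming the CLI can already collect a `{environment: {package: version}}` mapping (how it queries each interpreter is not shown, and `find_desyncs` is a hypothetical name):

```python
def find_desyncs(environments):
    """Return {package: {version: [environments]}} for packages whose
    installed version differs between environments."""
    seen = {}
    for env_name, packages in environments.items():
        for package, version in packages.items():
            seen.setdefault(package, {}).setdefault(version, []).append(env_name)
    # Keep only packages that appear with more than one distinct version
    return {pkg: versions for pkg, versions in seen.items() if len(versions) > 1}


if __name__ == "__main__":
    report = find_desyncs({
        "ida": {"binsync.core": "3.0.1"},
        "angr": {"binsync.core": "3.0.0"},
    })
    print(report)
```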

#### Downsides

1. **May not work for all decompilers**: some vendored Python interpreters ship without `pip`; we can potentially vendor our own pip or just use https://bootstrap.pypa.io/get-pip.py to bypass this issue.
2. **May not work for all decompilers**: (at least) some versions of angr-management have no `site-packages` directory to install our binsync libraries into.
3. This uses the default environment of the interpreter; things can clobber each other and users can uninstall dependencies accidentally; likewise, the install can break if the decompiler's interpreter changes.

## Single binsync CLI concurrently managing per-interpreter binsync-managed virtualenvs

Building upon the binsync package manager concept with a broken out `binsync.api` and `binsync.core`:

The `binsync` CLI would create and install packages into per-interpreter virtualenvs (for example, `~/.binsync/ida/venv/` for IDA).
Decompiler hooks would `import` this code; this could be done by appending or prepending the virtualenv's path to their `sys.path` (temporarily or permanently), or perhaps with `importlib`; the method of importing is an implementation detail.
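One way the temporary `sys.path` variant could look is sketched below. The helper names are hypothetical, and the caller is assumed to already know the virtualenv's `site-packages` directory, whose exact location depends on the OS and Python version.

```python
import importlib
import sys
from contextlib import contextmanager


@contextmanager
def prepended_sys_path(path):
    """Temporarily put `path` at the front of sys.path."""
    path = str(path)
    sys.path.insert(0, path)
    try:
        yield
    finally:
        # Remove our entry again, even if the import raised
        try:
            sys.path.remove(path)
        except ValueError:
            pass


def import_from(path, module_name):
    """Import `module_name` with `path` temporarily first on sys.path."""
    importlib.invalidate_caches()
    with prepended_sys_path(path):
        return importlib.import_module(module_name)
```

A caveat with the temporary approach: the imported module stays in `sys.modules`, but anything it imports lazily after the path is restored would no longer be found, so a permanent path entry may turn out to be simpler in practice.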

Benefits:
1. These hooks would persist and remain functional even if a decompiler changed python interpreters.
2. Decompilers that have no `site-packages` directory to install binsync plugins to would be supported, as the code would be installed in the virtualenv

#### Downsides

1. **May not work for all decompilers**: some vendored Python interpreters ship without `pip`; we can potentially vendor our own pip or just use https://bootstrap.pypa.io/get-pip.py to bypass this issue.
2. All binsync packages would be installed here, thus there would be no need to keep synchronized multiple concurrent copies.
3. Potential version clobbering issues: If we require `toml` as a dependency of a plugin, and `toml` already exists in the interpreter's path, and has been loaded, ours in the virtualenv may be ignored, which might cause issues if we need specific features from specific versions.**\***

**\*** This may not be an issue in practice; the current binsync and all of its plugins require `python3.6` and have a small list of dependencies; hitting this would also require a distinct module loaded into the decompiler that runs in the same interpreter, loads before binsync, and requires a version of one of our dependencies that does not support `3.6`. @mahaloz doubted we would have to worry about it, but it is still a possible issue worth mentioning. We could also just vendor dependencies.

## Single binsync CLI managing single binsync-managed virtualenv

Building upon the binsync virtualenv manager idea:

The `binsync` cli would create and install packages into a single virtualenv: `~/.binsync/venv/`, which would again be hooked by the decompiler hooks.

Benefits over multiple virtualenvs:
1. We no longer need to keep multiple version-synced copies of binsync.
2. Since we aren't necessarily using the exact interpreter of the decompilers, we can utilize `pip`

Technically, the installed packages might not support the decompiler's Python version (i.e. pip could theoretically select a release matching the interpreter running pip rather than the decompiler's interpreter).
This is really not an issue though, since each plugin is coded specifically to work with its decompiler, so we simply choose not to require `python3.8` if we know IDA might use `3.6`.
We could also add runtime checks to prevent this, if desired.
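Such a runtime check could be as small as the sketch below, run by a hook or plugin before importing anything else. The name `check_python` and the idea that each plugin declares its own minimum version are assumptions for illustration.

```python
import sys


def check_python(minimum, current=None):
    """Raise if the running interpreter is older than `minimum` (major, minor).

    `current` exists only to make the check easy to exercise; by default the
    version of the interpreter executing this code is used.
    """
    current = tuple(sys.version_info[:2]) if current is None else tuple(current)
    minimum = tuple(minimum)
    if current < minimum:
        raise RuntimeError(
            "binsync plugin requires Python %d.%d+, but the decompiler runs %d.%d"
            % (minimum + current)
        )
    return current
```

The message is built with `%` formatting rather than an f-string so that the check itself still parses on interpreters older than 3.6.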

#### Downsides

1. Potential version clobbering issues: If we require `toml` as a dependency of a plugin, and `toml` already exists in the interpreter's path, and has been loaded, ours in the virtualenv may be ignored, which might cause issues if we need specific features from specific versions.**\***

**\*** This may not be an issue in practice; the current binsync and all of its plugins require `python3.6` and have a small list of dependencies; hitting this would also require a distinct module loaded into the decompiler that runs in the same interpreter, loads before binsync, and uses a version of one of our dependencies that does not support `3.6`. @mahaloz doubted we would have to worry about it, but it is still a possible issue worth mentioning. We could also just vendor dependencies.

# Overall

I think we should have binsync manage binsync packages; in my opinion, the biggest question is whether we want:
1. A CLI which manages the binsync installs across each interpreter environment
2. A CLI that manages a singular binsync virtualenv that decompilers load code from

In my opinion, the biggest downside of a virtualenv is possible dependency version collisions, since we would not be using virtualenvs as intended; though given the requirements to hit this issue, and after my conversation with @mahaloz, it seems rather unlikely. In that case, the benefit is that we no longer need to keep multiple version-synced copies of this package across multiple environments that a user could easily mess up / clobber.
