Skip to content

Remove duplicate code in hierarchical collections.#209

Open
player-03 wants to merge 1 commit into
feathersui:masterfrom
player-03:deduplicate_hierarchical_collections
Open

Remove duplicate code in hierarchical collections.#209
player-03 wants to merge 1 commit into
feathersui:masterfrom
player-03:deduplicate_hierarchical_collections

Conversation

@player-03
Copy link
Copy Markdown
Contributor

Because shorter functions are easier to read. I tried to only remove lines that exactly matched, not things like TreeCollection.getLength() that look like duplicate code but actually operate on different types.

@player-03
Copy link
Copy Markdown
Contributor Author

Update: TreeCollection.getLength() is fixable too. Should I add that to this PR?

@joshtynjala
Copy link
Copy Markdown
Member

Perhaps ironically, the minor duplication is intended to make these branches easier to understand. It makes each condition a bit more self-contained, and you don't need to keep track of shared context from outside the conditions.

If I were to change anything to make it easier to understand, I'd be more inclined to move the code from the if (this._filterAndSortData != null) parts into separate functions (possibly with inline, so that there's no risk of hurting performance). Something more like this:

public function get(location:Array<Int>):T {
	if (this._pendingRefresh) {
		this.refreshFilterAndSort();
	}
	if (location == null || location.length == 0) {
		throw new RangeError('Item not found at location: ${location}');
	}
	if (this._filterAndSortData != null) {
		return this.getWithFilterAndSort(location);
	}
	var branchChildren = this.findBranchChildren(this._array, this._itemToChildren, location);
	var index = location[location.length - 1];
	if (index < 0 || index >= branchChildren.length) {
		throw new RangeError('Item not found at location: ${location}');
	}
	return branchChildren[index];
}

private inline function getWithFilterAndSort(location:Array<Int>):T {
	var branchChildren = this.findBranchChildren(this._filterAndSortData, this.filterAndSortDataItemToChildren, location);
	var index = location[location.length - 1];
	if (index < 0 || index >= branchChildren.length) {
		throw new RangeError('Item not found at location: ${location}');
	}
	return branchChildren[index].item;
}

@player-03
Copy link
Copy Markdown
Contributor Author

I'm not sure it makes it easier to understand. When I see two similar blocks of code I assume there's a significant difference to look out for. Otherwise it's just clutter: I have to read through twice as much code looking for differences that aren't there. It doesn't quite take twice as long, but it takes longer than reading a single slightly-more-complex block.

I do support creating functions to help organize, but it's even better if the function can reduce the clutter, rather than just moving it. Actually, moving it might make things worse, since I'll have to scroll back and forth to look for differences.


Honestly, it'd be nice to remove the duplicate blocks completely, rather than just factoring out the common code as I did here.

findBranchChildren() is a good example of what I'd go for, it can handle either type of array, and removes a lot of duplicate code. It takes itemToChildren to handle the differences between Array<T> and Array<FilterAndSortItem<T>>, which frees it to focus on the shared logic.

You could use the exact same approach in other places too:

public function get(location:Array<Int>):T {
	if (this._pendingRefresh) {
		this.refreshFilterAndSort();
	}
	if (location == null || location.length == 0) {
		throw new RangeError('Item not found at location: ${location}');
	}
	if (this._filterAndSortData != null) {
		return this.getBranchChild(this._filterAndSortData, this.filterAndSortDataItemToChildren, this.filterAndSortDataArrayIndexToItem, location);
	}
	return this.getBranchChild(this._array, this._itemToChildren, this._arrayIndexToItem, location);
}

private function getBranchChild<U>(source:Array<U>, itemToChildren:(U) -> Array<U>, arrayIndexToItem:(Array<U>, Int) -> T, location:Array<Int>):T {
	var branchChildren = this.findBranchChildren(source, itemToChildren, location);
	var index = location[location.length - 1];
	if (index < 0 || index >= branchChildren.length) {
		throw new RangeError('Item not found at location: ${location}');
	}
	return arrayIndexToItem(branchChildren, index);
}

private function arrayIndexToItem(array:Array<T>, index:Int):T {
	return array[index];
}

private function filterAndSortDataArrayIndexToItem(array:Array<FilterAndSortItem<T>>, index:Int):T {
	return array[index].item;
}

This won't handle everything, though. For instance, set() has some non-parallel logic, where it can dispatch one of three events if filtering is enabled, but only one in either of the other cases. I've got more ideas for that, and I'll write them up later. No rush until then.

@joshtynjala
Copy link
Copy Markdown
Member

I'm sorry, but I find your newly suggested code much more difficult to read. I felt my eyes frequently darting around to read various function signatures and type parameters, all while trying not to topple my mental representation of the variables/scopes in the call stack as I jumped from function to function running the code in my head.

When I write some code, I try to make sure that, when I read that same code in the future, I'm not juggling too much of that sort of state in my head.

That being said, what I just said about jugging state isn't too different than what you've explained. You seem to be having difficulty keeping track of state in a different way than I'm having difficulty. I don't know how to resolve that difference to make both of us satisfied, though.

@player-03
Copy link
Copy Markdown
Contributor Author

Well that's no good! Of course you'd have to be able to read it. Funny thing is, that was supposed to be the simple example, before I got into bigger changes.

To help me understand the problem, what makes the above different from the existing ArrayHierarchicalCollection.get() implementation? (Presented here in the same format.)

private var _itemToChildren:(T) -> Array<T>;

public function get(location:Array<Int>):T {
	if (this._pendingRefresh) {
		this.refreshFilterAndSort();
	}
	if (location == null || location.length == 0) {
		throw new RangeError('Item not found at location: ${location}');
	}
	if (this._filterAndSortData != null) {
		var branchChildren = this.findBranchChildren(this._filterAndSortData, this.filterAndSortDataItemToChildren, location);
		var index = location[location.length - 1];
		if (index < 0 || index >= branchChildren.length) {
			throw new RangeError('Item not found at location: ${location}');
		}
		return branchChildren[index].item;
	}
	var branchChildren = this.findBranchChildren(this._array, this._itemToChildren, location);
	var index = location[location.length - 1];
	if (index < 0 || index >= branchChildren.length) {
		throw new RangeError('Item not found at location: ${location}');
	}
	return branchChildren[index];
}

private function findBranchChildren<U>(source:Array<U>, itemToChildren:(U) -> Array<U>, location:Array<Int>):Array<U> {
	var branchChildren = source;
	for (i in 0...location.length - 1) {
		var index = location[i];
		if (index < 0 || index >= branchChildren.length) {
			throw new RangeError('Item not found at location: ${location}');
		}
		var child = branchChildren[index];
		branchChildren = (itemToChildren != null) ? itemToChildren(child) : null;
		if (branchChildren == null) {
			throw new RangeError('Item not found at location: ${location}');
		}
	}
	return branchChildren;
}

private function filterAndSortDataItemToChildren(item:FilterAndSortItem<T>):Array<FilterAndSortItem<T>> {
	return item.children;
}

@joshtynjala
Copy link
Copy Markdown
Member

Yes, that existing code has the same issues. I suspect that I ended up writing it that way because even I have some limit where the amount of duplication becomes too cumbersome and the extra abstraction helps to alleviate that.

@joshtynjala
Copy link
Copy Markdown
Member

even I have some limit where the amount of duplication becomes too cumbersome

Just adding some actual numbers.

findBranchChildren() is currently called from 16 different places in ArrayHierarchicalCollection. Its body is 12 lines long. That would be a lot of duplicated code to maintain and keep in sync, so it's worth the overhead of the abstraction required to support both the base case and the special filterAndSortData case. It helps a lot that itemToChildren was already required for the base case, so filterAndSortDataItemToChildren didn't really make that function more complicated than it already had to be.

On the other hand, the proposed getBranchChild() might only be called from 2 places. I took a quick glance at the other methods, and I don't think that they would be able to make use of getBranchChild(). They'd need their own new methods.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants