Skip to content

Use new export API#525

Merged
scribu merged 18 commits into
masterfrom
export-api
Dec 14, 2013
Merged

Use new export API#525
scribu merged 18 commits into
masterfrom
export-api

Conversation

@scribu
Copy link
Copy Markdown
Member

@scribu scribu commented Jun 16, 2013

The current code that implements wp export is a mess, because the WordPress export code is a mess.

@nb proposed a proper export API (he says it's already being used on wordpress.com).

Ticket: http://core.trac.wordpress.org/ticket/22435
Diff: nb/WordPress@master...export-api

If we namespace it properly, we can start using it before it lands in Core.

Tasks:

  • do something with the --verbose flag
  • replace VerboseExportWriter class with action inside WP_Export_Split_Files_Writer
  • add parameter for specifying filename format #issuecomment-20013095
  • call export_wp action again?
  • manually test with large export

@scribu
Copy link
Copy Markdown
Member Author

scribu commented Jun 16, 2013

@danielbachhuber Please give this a spin when you get the chance.

@danielbachhuber
Copy link
Copy Markdown
Member

Will do early next week

On Mon, Jun 17, 2013 at 2:39 AM, Cristi Burcă notifications@github.com
wrote:

@danielbachhuber Please give this a spin when you get the chance.

Reply to this email directly or view it on GitHub:
#525 (comment)

@scribu
Copy link
Copy Markdown
Member Author

scribu commented Jun 25, 2013

I find myself overusing git rebase too. It makes the history cleaner, but it's a fake history. Plus, you lose the build results for intermediate commits.

Let's make a deal pact: no more rebasing of branches. Instead, when a branch gets dirty outdated, we just merge master into it and fix the conflicts.

@scribu
Copy link
Copy Markdown
Member Author

scribu commented Jun 25, 2013

If a branch gets really outdated, we just abandon it and start a new one, with a different name.

@danielbachhuber
Copy link
Copy Markdown
Member

Let's make a deal: no more rebasing of branches. Instead, when a branch gets dirty, we just merge master into it and fix the conflicts.

Works for me. Hopefully this leads to smaller branches too, so there aren't a bunch of merge commits clogging the history :)

I took a brief look at this. It seems like a good improvement, but I'm not yet sold on it as a great improvement.

Some thoughts:

  • Does it properly handle large exports (e.g. 200MB+)? We had some object cache reset hack previously to avoid memory leaks that's been removed.
  • Seems much faster. Do we need the CLI progress indicators back?
  • I included the CLI arguments in the filename originally so you could easily create multiple export files on a single day without worrying about them overwriting each other. It would be nice to restore those.
  • Not sure how I feel about removing file_item_count and replacing it with splitting based on file size.

cc @nb

@scribu
Copy link
Copy Markdown
Member Author

scribu commented Jun 25, 2013

Does it properly handle large exports (e.g. 200MB+)? We had some object cache reset hack previously to avoid memory leaks that's been removed.

No idea; need to test.

Seems much faster. Do we need the CLI progress indicators back?

I'm not sure how easy it would be to add them back; we'd have to fork even more of the new export API.

I included the CLI arguments in the filename originally so you could easily create multiple export files on a single day without worrying about them overwriting each other. It would be nice to restore those.

I don't think including the CLI arguments verbatim was a good idea to start. We could add their md5 hash though.

Not sure how I feel about removing file_item_count and replacing it with splitting based on file size.

I think it makes sense; the file size is what you really care about when splitting.

@danielbachhuber
Copy link
Copy Markdown
Member

we'd have to fork even more of the new export API.

Can we instead ask Nikolay to include actions within the export API that achieve the end results we want? I haven't done a diff to see what we've changed, but my preference would be no forking at all.

I don't think including the CLI arguments verbatim was a good idea to start. We could add their md5 hash though.

Care to explain? Including them verbatim achieves two things:

  1. Tells me which export the file corresponds with.
  2. Prevents a second export from overwriting the first.

An md5 hash only achieves the second.

@scribu
Copy link
Copy Markdown
Member Author

scribu commented Jun 25, 2013

It's not a good idea because the arguments can get really long and might contain unusual characters (for a file name).

I understand it's convenient, but it doesn't justify the potential headaches.

@danielbachhuber
Copy link
Copy Markdown
Member

It's not a good idea because the arguments can get really long and might contain unusual characters (for a file name).

Let that become a problem when it becomes a problem. I was sanitizing the filename previously.

@scribu
Copy link
Copy Markdown
Member Author

scribu commented Jun 25, 2013

Copying the relevant code chunk here, for convenience:

$append = array( date( 'Y-m-d' ) );
foreach( array_keys( $args ) as $arg_key ) {
    if ( $defaults[$arg_key] <> $args[$arg_key] && 'post__in' != $arg_key )
        $append[]= "$arg_key-" . (string) $args[$arg_key];
}
$file_name_base = sanitize_file_name( $sitename . 'wordpress.' . implode( ".", $append ) );

Still not sold; it would be cleaner to allow the user to specify an "ID" for a batch of export files. For example:

wp export --batch-id=events-for-x --post_type=event

which would generate:

testground.wordpress.2013-06-events-for-x.0.xml
testground.wordpress.2013-06-events-for-x.1.xml
testground.wordpress.2013-06-events-for-x.2.xml

Or even allow them control of the entire filename format:

wp export --filename-format='events-for-x-{site}-{date}-{nr}.xml' --post_type=event

@scribu
Copy link
Copy Markdown
Member Author

scribu commented Jun 25, 2013

I guess my beef with the old code is that 1) the filenames generated by default are too dynamic and 2) you can't change the default.

@nb
Copy link
Copy Markdown

nb commented Jun 25, 2013

I am cool with adding actions and filters, I almost haven't thought how it will be extended, so I almost haven't added actions or filters.

@scribu
Copy link
Copy Markdown
Member Author

scribu commented Jul 5, 2013

Added 'wp_export_new_file' hook: 743092d

Suggestions for better name and/or better placement (while achieving similar functionality) welcome.

@scribu
Copy link
Copy Markdown
Member Author

scribu commented Dec 14, 2013

Guys, I'm going to merge this pretty much as-is.

My hunch is that people will be a lot more likely to submit incremental improvements once they see that the code is relatively clean.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants