
Monthly Archives: October 2013

Merging Word documents … What a pain

For the umpteenth time in my career I had a document in MS Word (2010 in this case) that was reviewed by multiple individuals (about 8) with changes to formatting, content, etc. along with reviewer comments. These then had to be merged back into a single document, which is certainly easier than it used to be – but is still incredibly inefficient. Let’s look at the normal flow:

  1. Document Created (O1)
  2. Send to Review (email or ??)
  3. Receive n copies of the document back (R1 – Rn)
  4. Make a copy of the original doc (O2) and combine with the revised copy R1 [in the original document]
  5. Save O2 and close both files (or all of the merge panes).
  6. Select the combine option again, find O2 and merge in R2.
  7. Rinse and repeat steps 5 & 6 through Rn.
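The manual flow above is really just a loop: combine the original with R1 into a working copy, then keep combining that working copy with each remaining review. Here is a minimal Python sketch of that loop, not the code behind any real tool. The `merge_reviews` function and the `combine` callable are hypothetical names; `combine` stands in for whatever actually performs Word's Combine step (a COM call, a library, or a person clicking through the dialog), and is passed in so the ordering logic can be exercised without Word installed.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Tuple

# Hypothetical signature: combine(base, revised, output) writes the result of
# combining `base` with `revised` to `output`. It stands in for Word's Combine
# step and is injected rather than implemented here.
Combine = Callable[[Path, Path, Path], None]


def merge_reviews(
    original: str,
    reviews: Iterable[str],
    output: str,
    combine: Combine,
) -> List[Tuple[str, str]]:
    """Fold every reviewed copy (R1..Rn) into a working copy (O2), in order.

    Returns the (base, revised) file-name pairs in the order they were combined.
    """
    review_paths = [Path(r) for r in reviews]
    if not review_paths:
        raise ValueError("nothing to merge: no reviewed copies given")

    base = Path(original)      # the first pass starts from O1
    working = Path(output)     # O2, the working copy that accumulates changes
    steps: List[Tuple[str, str]] = []

    for revised in review_paths:
        combine(base, revised, working)
        steps.append((base.name, revised.name))
        base = working         # every later pass builds on O2

    return steps
```

Called with a stub such as `lambda base, revised, out: print(base, "+", revised, "->", out)`, it prints one line per pass, which makes it easy to see the order the reviews would be folded in before touching any real documents.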

Certainly, using the version tracking in SharePoint 2010/2013 would be considerably easier while also allowing for concurrent editing, but that’s not always an option. Thinking of a large organization where this is done many hundreds of times each day, by thousands of employees, I was surprised that a small utility to perform the function of “merge these N files with O1” didn’t pop up on a quick Google/StackOverflow search. In my one instance, I missed a file … or messed something up the first time through, which killed about 20 minutes of my day that could otherwise have been spent productively, actually addressing the commentary. To that end I decided to do something about it, which I’ll share in source once I’ve gone through it enough that it wouldn’t be embarrassing. It’s gotten a little bit cleaner, and a whole lot faster – but since it’s based on Interop, it’s a temporary solution until the feature is fully implemented in Eric White’s EXCELLENT OpenXmlPowerTools library (link to the issue).

For the time being we’ll just call this WordMerge; it runs as a standalone executable. There isn’t a heap of validation and error checking in place at the moment (e.g. a bad output path could be typed in, which will throw a general exception), but it has been put through its paces and works pretty well with up to the 50 documents I tried. For my use, I’m perfectly happy with how it runs – but if anyone else grabs the concept before I get code out there, I’m open to suggestions (a SharePoint extension, an Office plugin, and a WPF version are already planned, BTW).
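To illustrate the kind of up-front checking that would catch a mistyped output path before any merging starts, here is a small Python sketch. The function name `check_merge_inputs` and the specific checks are assumptions made for illustration only; they are not taken from the tool itself.

```python
from pathlib import Path
from typing import Iterable, List


def check_merge_inputs(original: str, reviews: Iterable[str], output: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: List[str] = []

    original_path = Path(original)
    if not original_path.is_file():
        problems.append(f"original document not found: {original_path}")

    for review in reviews:
        review_path = Path(review)
        if not review_path.is_file():
            problems.append(f"reviewed copy not found: {review_path}")

    # The output file need not exist yet, but its folder must.
    output_path = Path(output)
    if not output_path.parent.is_dir():
        problems.append(f"output folder does not exist: {output_path.parent}")

    return problems
```

Reporting every problem at once, rather than stopping at the first, means a user who fat-fingered two paths only has to go around once.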

You can see the interface in the screenshot below … not much to explain.

[Screenshot: WordMerge interface]

Certainly, merging everything in at once isn’t foolproof. If one person moved a lot of content and other reviewers modified that same content, MS Word has no way of knowing the “right” order in which to resolve the conflicts. In these cases there may be straggling words that weren’t part of the move (especially if the move was processed after the changes in the merge order), which I’ve tried to address by allowing the merged documents list to be re-ordered. That notwithstanding, the ability to toggle reviewers and types of changes in the review pane thankfully makes reconciling a much simpler process.

I’d be interested in how you normally handle disparate review documents … going one at a time and copying over what you’d like to keep, merging them all by hand and accepting/rejecting changes, or something else entirely? I’ll pull thoughts into the merger and possibly tie it in with some other work I’m contemplating that uses NLP parsing systems in conjunction with context-free grammar generation to assist edits and rewrites from a Knowledge Management repository. That’s definitely further out on the horizon though. Next up is a syntax highlighter for Word, simply because I’m tired of the wasted time and inconsistent formatting that inevitably result in software documentation when you’ve got to embed source. If there are suggestions for that before I get it released, feel free to send them my way.

Grab it, use it, save yourself some headache. If you’ve got feedback for me, let me know.

//Levii

 
 