Apparently, we’re releasing a service called “Word Automation Services” for SharePoint 2010 that is designed to address one major area of server-side usage of Word: document format conversion.
Word Automation Services is a new shared service in SharePoint Server 2010. It provides unattended, server-side conversion of documents into formats that are supported by the Microsoft Word client application.
In simplest terms, Word Automation Services takes the "Save As…" functionality of the Word client application and replicates it for the server. Specifically, it provides the following capabilities:
- Opens documents that Word can open, including:
  - Open XML File Format documents (.docx, .docm, .dotx, .dotm).
  - Word 97-2003 documents (.doc, .dot).
  - Rich Text Format files (.rtf).
  - Single File Web Pages (.mht, .mhtml).
  - Word 2003 XML Documents (.xml).
  - Word XML Documents (.xml).
- Supports all automatic tasks that execute when a document opens, such as:
  - Updating the Table of Contents, the Table of Authorities, and index fields.
  - Recalculating all field types.
  - XML mapping.
  - Merging of "alternate format chunks".
  - Setting the compatibility mode to the latest version or to previous versions of Word.
- Saves document types that Word can save. This list is identical to the previous list of files that Word Automation Services can open, but also includes the following types:
  - Portable Document Format (PDF) files.
  - XML Paper Specification (XPS) files.
With Word Automation Services, many of the tasks that previously required you to run the Word client application can now be automated to run unattended, in a more reliable and scalable way than previous solutions allowed.
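The open and save lists above boil down to a simple rule: anything the service can open can be saved as any of the same types, plus PDF and XPS. A minimal Python sketch of that rule follows. It is only an illustrative lookup table built from the lists above, not part of any SharePoint API; the names `OPEN_FORMATS`, `SAVE_FORMATS`, and `can_convert` are made up for this example.

```python
import os

# Illustrative only: extensions taken from the lists above.
# These names are hypothetical and not part of any SharePoint API.
OPEN_FORMATS = {
    ".docx", ".docm", ".dotx", ".dotm",  # Open XML documents
    ".doc", ".dot",                      # Word 97-2003 documents
    ".rtf",                              # Rich Text Format
    ".mht", ".mhtml",                    # Single File Web Pages
    ".xml",                              # Word XML documents
}

# Everything that can be opened can be saved, plus the fixed formats.
SAVE_FORMATS = OPEN_FORMATS | {".pdf", ".xps"}


def can_convert(source_name: str, target_name: str) -> bool:
    """Return True if the source type can be opened and the target type saved."""
    src = os.path.splitext(source_name)[1].lower()
    dst = os.path.splitext(target_name)[1].lower()
    return src in OPEN_FORMATS and dst in SAVE_FORMATS
```

Note the asymmetry: `can_convert("report.docx", "report.pdf")` is true, but the reverse is not, because PDF and XPS appear only on the save side.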
THE WORD BLOG ANNOUNCEMENT
This is what the Word blog has to say on the topic:
Have you ever wanted to convert .docx files into PDF? We’ve heard from many customers trying to perform server-side conversions of Open XML files (.docx) into fixed formats (PDF and XPS) using the Word desktop application, and that’s what motivated us to create Word Automation Services.
As a component of SharePoint 2010, Word Automation Services allows you to perform file operations on the server that previously required automating desktop Word:
- Converting between document formats (e.g. DOC to DOCX)
- Converting to fixed formats (e.g. PDF or XPS)
- Updating fields
- Importing "alternate format chunks"
- Etc.
If you’ve done any automation of Word, you’re probably familiar with the challenges of doing so – challenges well documented by this Knowledge Base article: http://support.microsoft.com/kb/257757. With Word Automation Services, those challenges are a thing of the past:
- Reliability – The service was built from the ground up to work in a server environment, which means that you no longer have to worry about issues such as dialog boxes that halt the process while waiting for user input, or having to create interactive user accounts under which to run the application in order to avoid permissions issues.
- Speed – The service is optimized to perform server-side file operations, and in doing so provides performance significantly better than existing solutions.
- Scalability – The service can take advantage of the processing power available on typical server hardware (multiple processors, additional memory). For example, although a single instance of WINWORD.EXE can only utilize a single core of processing power, with Word Automation Services you can specify the number of simultaneous conversions (and the number of processing cores) to use based on the available hardware.
And you still have a solution that has 100% fidelity with respect to the Word desktop application – documents are paginated the same way on the server as they are on the client, ensuring that what you see on the client is what you get from the server.
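To make the scalability point concrete, here is a minimal Python sketch of running a configurable number of conversions at once. It is not the service's API or implementation: `convert`, `convert_batch`, and `simultaneous_conversions` are hypothetical names, and `convert` is a stand-in that only rewrites the file extension rather than converting anything.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def convert(path: str) -> str:
    """Hypothetical stand-in for one conversion: it only swaps the extension."""
    root, _ = os.path.splitext(path)
    return root + ".pdf"


def convert_batch(paths, simultaneous_conversions=None):
    """Run up to `simultaneous_conversions` conversions at the same time.

    Defaults to the number of cores reported by the operating system.
    """
    workers = simultaneous_conversions or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input paths.
        return list(pool.map(convert, paths))
```

The only point of the sketch is that the degree of parallelism is a setting chosen to match the hardware, rather than being fixed at one document at a time as with a single WINWORD.EXE instance.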
Word Automation Services and the Open XML SDK: Better Together
One of the most important things to understand about the service is what it doesn’t do: this service is not intended to be a 1:1 replacement for the existing desktop object model.
Instead, the service is one half of a replacement for the existing object model – the other half being the Open XML SDK.
- The SDK is designed to handle tasks that don’t require application logic, such as inserting or deleting content (paragraphs, tables, pictures), inserting data from other data sources, sanitizing content (removing content, accepting tracked changes), etc.
- The service is designed to handle those few tasks that do need application logic: reading all of the document formats that Word supports, converting to all of the output formats that Word supports, recalculating dynamic fields, etc.
The two halves together enable the creation of rich, end-to-end solutions that never require automation of the client applications, yet sacrifice none of their capabilities – another topic we’ll discuss in more detail in the future.
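To illustrate why inserting content needs no application logic, here is a rough Python sketch. It does not use the Open XML SDK (which is a .NET library); it relies only on the fact that a .docx file is a ZIP package whose main text lives in `word/document.xml`. The helper names (`sample_package`, `append_paragraph`, `paragraph_texts`) are made up for this example, and the sample package is a bare fixture rather than a complete document. Real documents carry extra namespaces and parts that need more careful handling than this, which is the kind of work the SDK takes care of.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)


def sample_package() -> bytes:
    """Build a bare package holding only word/document.xml (a test fixture)."""
    xml = (
        f'<w:document xmlns:w="{W}"><w:body>'
        "<w:p><w:r><w:t>Hello</w:t></w:r></w:p><w:sectPr/>"
        "</w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr("word/document.xml", xml)
    return buf.getvalue()


def append_paragraph(package: bytes, text: str) -> bytes:
    """Return a copy of the package with one plain paragraph added to the body."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "word/document.xml":
                root = ET.fromstring(data)
                body = root.find(f"{{{W}}}body")
                p = ET.Element(f"{{{W}}}p")
                r = ET.SubElement(p, f"{{{W}}}r")
                ET.SubElement(r, f"{{{W}}}t").text = text
                # Section properties, if present, must stay last in the body.
                children = list(body)
                if children and children[-1].tag == f"{{{W}}}sectPr":
                    body.insert(len(children) - 1, p)
                else:
                    body.append(p)
                data = ET.tostring(root, encoding="utf-8")
            dst.writestr(item, data)
    return out.getvalue()


def paragraph_texts(package: bytes) -> list:
    """Return the plain text of each paragraph in the main document part."""
    with zipfile.ZipFile(io.BytesIO(package)) as z:
        root = ET.fromstring(z.read("word/document.xml"))
    return [
        "".join(t.text or "" for t in p.iter(f"{{{W}}}t"))
        for p in root.iter(f"{{{W}}}p")
    ]
```

Nothing here paginates, recalculates fields, or renders to PDF. Those steps depend on Word's layout and field logic, which is the half of the work the service is meant to cover.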
