Howdy, I’m David B Heise and I work on the Office Security team responsible for testing Office File Validation (codename: Gatekeeper). There have been some misconceptions about the new file validation feature in Microsoft Office 2010 and I hope to clear these up and explain the why and what.
Why Validate Binary Files?
Throughout the years the office binary formats have necessarily evolved and grown in scale and complexity. The reasons why the formats are complex have been discussed sufficiently elsewhere (see Joel Spolsky's article here) so we won’t go into that discussion here, however these binary formats are very well documented here. We have found that malicious attackers use the binary files as an attack vector to infect a targeted user, as such we wanted to come up with a way to stop this from happening. One thing our team has been doing is whenever a new Office file format attack is reported to Microsoft we have been checking it with our validation to see how well we’re doing. So far, so very good!
What is The Gatekeeper?
Office File Validation is a feature that was originally introduced in Publisher 2007 to validate Publisher’s PUB files. It verifies that a particular binary file conforms to the application’s expectations. In Office 2010 we’ve expanded this feature significantly to include binary formats for Word, Excel, and PowerPoint. Please note that this feature is for binary formats ONLY (i.e. PUB, DOC, XLS, PPT, etc), this does not validate the XML based documents (i.e. DOCX, XLSX, PPTX, etc), nor does it validate macros or other custom items. What it does validate is the structure of the file, for example if you have a XLS file that has a FONTINDEX structure with the ifnt value set to 4 (which is an invalid value for that particular item) then it fails validation.
How Does It Work?
Whenever an un-trusted binary file (i.e. not in a trusted location and not a trusted document) is loaded by Word, PowerPoint, or Excel it goes through a check to see if it is a valid file. This check looks at the specific bits of the file that the application is about to parse, in other words the relevant OLESS Streams. If it is determined to be valid, it opens as normal, nothing to see…move along…move along. However if it is found to be invalid, it is sent (by default) to the Protected View.
If you click on that text you will be taken to the Backstage view where you will have the option to open the file in the full application experience. Please note that this is a trust decision that will mark this particular file as a trusted file, and as such, will NOT be validated the next time you open this file.
After you’re done with the file and close the application you may see a prompt like this:
This prompt only appears at most once every two weeks (per application) and gives you the option to send the failing file (or files) to us via Windows Error Reporting. Of course you can remove a file or two if you don’t want to share that information, but by sending us the file we can analyze it further to improve Office File Validation.
How do I control this?
Via Policy
We realize that many administrators (or security conscious users) may not like the idea of opening a file that fails validation, so there is a group policy to control the default action when a file fails validation. These policies are located under the application’s “Options\Security\Trust Center\Protected View” in the group policy templates and it is a per application setting.
Via the Registry
There are several registry keys that control various aspects of Office File Validation.
Common Keys
HKCU\Software\Microsoft\Office\14.0\Common\Security\FileValidation \ReportingInterval - This is a DWORD that controls the number of days between the showing of the dialog to send files to Windows Error Reporting.
HKCU\Software\Microsoft\Office\14.0\Common\Security\FileValidation\DisableReporting - This is a DWORD that if set to 1 will disable the showing dialog (and thus the sending of files) to Windows Error Reporting.
Application Specific Keys
For these examples I’m going to use “Excel”, but these also work for “PowerPoint” and “Word”
HKCU\Software\Microsoft\Office\14.0\Excel\Security\FileValidation\EnableOnLoad – This is a DWORD that if set to 0 Office will not validate files.
HKCU\Software\Microsoft\Office\14.0\Excel\Security\FileValidation\DisableEditFromPV – This is a DWORD that if set to 1 will not allow files to be edited that fail validation.
Excel Specific KeysVia Scripting
HKCU\Software\Microsoft\Office\14.0\Excel\Security\FileValidation\PivotOptions – This is a DWORD that controls specific options around validating pivot caches (for performance reasons) in files that have them.
0 = Never validate any pivot cache
1 = Validate the pivot cache in the following cases: (1) file is opened from the internet, and the platform marks the file locally as having come from the internet. (2) The file is a Microsoft Outlook email attachment. (3) The user specifically opened the file in protected view. (4) The file is opened from a known "unsafe location" locally where internet content is cached, and any special user-defined untrusted locations, unless protected view unsafe locations are disabled via (a different) registry key. (5)The file is opened and the pivot cache is parsed on load.
2 =Always validate all pivot caches
For custom solutions built on top of Office there are a few interesting properties that have been added to the Application Objects that will disable file validation for that session. There is also an extra option for Excel to control the validation of Pivot Caches (i.e. the file cached data for pivot tables and charts). Here’s a powershell script example showing how to set these two options for Excel (but the FileValidation property would also apply for Word and PPT):
$excel = New-Object -comobject Excel.ApplicationThat’s great, but does it Cook?
# valid values are:
# msoFileValidationDefault = 0
# msoFileValidationSkip = 1
$excel.FileValidation = msoFileValidationSkip
# valid values are:
# xlFileValidationPivotDefault = 0 (do whatever you’d normally do, i.e. follow registry & default settings),
# xlFileValidationPivotRun = 1 (validate all pivot caches),
# xlFileValidationPivotSkip = 2 (don’t validate any pivot caches)
$excel.FileValidationPivot = xlFileValidationPivotSkip
We have made specific strides to ensure that file validation is very fast. Yes, it now takes more time to open a file, but we’re generally talking milliseconds more. In fact, you’d be hard pressed to find a normal sized file that takes more than a second to validate, most files validate in the 1 to 100 milliseconds range. Of course if the file is huge and super complex and takes an hour to open already…then yes it will take more than a second, but you probably aren’t going to notice anyway. In addition to that if the file takes more than 5 seconds to validate (so we’re talking very complex files here) we give you the option to cancel and go straight to the Protected View. After all we couldn’t just let you open it normally because then hackers would just make a file that was really complex…then take over your machine, which is exactly what this feature is trying to stop.
In addition for any file that takes a long time to validate (if it passes validation, fails validation, or validation is skipped) will also be shown the same Windows Error Reporting prompt as a failing file; giving you the option to send us the file for further analysis.
In a Nutshell
In talking with the developers one day we imagined a conversation that went like this:
“So what have you been working on?”At the end of the day the Office File Validation is really just a Yes/No function to inform the application if a file is valid or not, but that’s a really important function! In fact is also a really complex function, as anyone who’s ever even peeked into the file format specifications can attest. So there you have it, in a nutshell. Office File Validation will check your binary file to ensure the significant bits of your file are valid, and if you think we’re wrong you can either trust the file or let us know!
“Office File Validation”
“What’s that?”
“A check on an Office file to make sure it’s ok”
“So, you spent the last two years writing a Boolean function?”
“Well…um…yes, but it’s an important function!”
More...