How to Use Spotlight Metadata File Utilities on macOS

Spotlight metadata.

Spotlight is Apple's smart search indexer for macOS. Here's how to use metadata utilities to get more information about your documents.

Spotlight runs in the background on your Mac or iOS device and automatically indexes and scans the contents of your documents, so when you search for something, it can quickly find the results.

The main background daemon in Spotlight is called corespotlightd, and under full load it can consume up to 8-10% of CPU time.

On Apple Silicon Macs, corespotlightd can run up to four threads simultaneously during peak background indexing.

If you're an Apple developer, you can add the Core Spotlight framework to your app and have it index your app content internally so that the content is automatically made available to app users.

The Foundation framework includes additional Spotlight APIs that let you perform local searches of Spotlight data from within your application.

You'll also want to add CoreServices.framework to your Xcode project, since that's where the file metadata APIs are located.

There are also iCloud Spotlight features that we're not covering here.

Add CoreSpotlight.framework to Xcode.

Configuring indexed volumes

On macOS, you can specify which volumes you want Spotlight to index and which you don't. By default, unless you exclude volumes from Spotlight, they will be indexed.

If you exclude volumes from the Spotlight index, their contents will not appear in Spotlight search results.

If your Mac has multiple volumes on its drive, or you have external drives connected, you can turn Spotlight (and Siri) on or off on each one.

To do this, first open System Settings in macOS by choosing System Settings from the Apple menu in the menu bar.

On the left side of System Settings, scroll down and click Siri & Spotlight. In the Siri & Spotlight panel, you can turn Siri on or off, set keyboard shortcuts, set the language, and set how Siri handles history.

Below is the Spotlight section. Here you can specify what types of documents you want Spotlight to index.

If you disable a specific document or data type in this section, Spotlight will ignore all documents or data of that type during indexing.

If you scroll to the bottom of the panel, you will see a button labeled Spotlight Privacy. Click on it to open the privacy sheet.

The Siri & Spotlight settings panel.

The Privacy Sheet contains a list of all storage volumes that Spotlight currently excludes from indexing, which in most cases defaults to either no volumes or only the startup disk.

To add or remove volumes from the privacy sheet, you can drag them into or out of it, or click the + or - buttons below the list.

Spotlight Privacy Sheet.

Once a volume is added to the list, Spotlight stops indexing it.

When you are satisfied with the list of privacy exceptions, click Done to close the sheet, then close System Settings.

File metadata

When corespotlightd indexes your volume data, it indexes not only the contents of your files but also their metadata. Metadata is information associated with a file but not contained in the file itself.

Metadata includes (but is not limited to) information such as the file's creation and last modification date, size, version, type, name, and Finder comments displayed in information windows.

Spotlight uses the File Metadata API in Apple's Core Services framework to search and read metadata.

There are four main data types in the File Metadata API:

  1. MDSchema
  2. MDItem
  3. MDLabelDomain
  4. MDQuerySortOptionFlags

We won't go into all the details of data types, but the main type that stores a reference to a file system item and its metadata is the MDItem type.

Using the MDItem and Core Services APIs, you can retrieve, sort, and store metadata for items on local file systems.

There is also an older Apple document called the File Metadata Search Programming Guide that describes how to use the Spotlight API to search file metadata.

Spotlight Importers

If you open the /Library/Spotlight folder on your Mac's startup drive, you may notice one or several files with the extension .mdimporter. These are Spotlight metadata import plugins.

For example, Apple Pages and the original iBooks Author apps have .mdimporter plugins. The same can be said for some Microsoft 365 apps. Other apps provide them as well.

You can write your own .mdimporter plugins in Apple's Xcode, place them in the /Library/Spotlight folder, and Spotlight will use them to import metadata from files supported by your apps.

.mdimporter plugins are essentially collections of code and information that tell Spotlight what kinds of metadata can be imported and how to access that data. By using a custom .mdimporter, you can allow your app to store additional metadata and make it available to Spotlight for indexing.

Apple also has a (slightly older) developer document called the Spotlight Importer Programming Guide that shows how to write .mdimporter plugins.

.mdimporter Spotlight plugin.

Spotlight metadata utilities

Apple and third parties also provide several command line tools (CLIs) that you can use in the macOS Terminal app to access Spotlight metadata in file system objects stored on your devices.

Spotlight keeps indexed metadata in a local database on each attached disk volume. These Spotlight metadata databases are called stores.

Each store contains indexed metadata for each file system object, as well as some additional data that speeds up Spotlight searches. By storing and updating file metadata in a separate database, Spotlight can search and retrieve data much faster because it doesn't have to traverse the file system hierarchy each time.

On APFS volumes, Spotlight also uses some internal volume metadata in combination with the store's metadata for faster and more accurate searches.

There are many Spotlight CLI utility commands, but you'll likely want to use four key ones:

  1. mdutil
  2. mdimport
  3. mdls
  4. mdfind

You can get information about using any of these by opening Terminal, then typing man followed by a space and the name of the utility, and pressing Return on your keyboard.

For example:

man mdutil

Note that some commands require a file system path after the command name, and some do not. For example, mdfind takes a search query rather than a path, while mdls requires the path of the file you want to inspect.

To exit the man (manual) page viewer in Terminal, press q on your keyboard.

mdutil

The mdutil command is a simple utility that helps you manage Spotlight metadata stores on your Mac. Note that the volume must be mounted on the desktop in Finder for mdutil to work with it.

For example, with mdutil you can turn Spotlight indexing on or off for specific volumes, disable searching on a volume, erase the store for a volume, display the Spotlight indexing status of a volume, and much more.

You can also apply specific commands to Spotlight stores on each indexed volume and clear Spotlight store caches to force the store itself to be used directly.
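As a rough sketch of the operations described above, the script below prints the mdutil command for each action instead of running it, since the index-changing commands need administrator rights and should be reviewed first. The volume path /Volumes/Backup and the helper name mdutil_command are placeholders made up for this illustration; the -s, -i and -E flags are the standard mdutil options for status, indexing on/off, and erase-and-rebuild.

```shell
# Hypothetical external volume; substitute a volume mounted on your Mac.
VOLUME="/Volumes/Backup"

# Print (rather than run) the mdutil command for an action, so that
# index-changing commands can be reviewed before you paste them into Terminal.
mdutil_command() {
  case "$1" in
    status)  printf 'mdutil -s "%s"\n' "$2" ;;          # show indexing status
    enable)  printf 'sudo mdutil -i on "%s"\n' "$2" ;;  # turn indexing on
    disable) printf 'sudo mdutil -i off "%s"\n' "$2" ;; # turn indexing off
    rebuild) printf 'sudo mdutil -E "%s"\n' "$2" ;;     # erase and rebuild the store
    *)       echo "unknown action: $1" >&2; return 1 ;;
  esac
}

for action in status disable enable rebuild; do
  mdutil_command "$action" "$VOLUME"
done

# Reading the status changes nothing, so run it directly where mdutil exists.
if command -v mdutil >/dev/null 2>&1; then
  mdutil -s / || echo "could not read indexing status" >&2
fi
```

Printing the commands first is a deliberate choice here: erasing a store triggers a full reindex of the volume, which can take a while on large disks.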

Type man mdutil and press Return on your keyboard in Terminal to see the full usage information for mdutil.

mdimport

mdimport is a Spotlight CLI utility that allows you to manually import all searchable metadata from a file system hierarchy into the Spotlight metadata store. It uses the .mdimporter plugins mentioned above to import and search data.

You can use mdimport to print all metadata elements stored for each indexed element in the file system hierarchy, except for elements stored with the kMDItemTextContent key, since these elements contain the actual text content of the file system elements.

You can also use mdimport to test .mdimporter plugins written by you or your team.
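The following is a minimal sketch of checking which importers are installed and test-importing a single file. The helper list_importer_bundles and the sample path ~/Documents/example.pdf are assumptions made for this example. The flags used are -L (list importers), -t (test import without storing anything in the index) and -d2 (print the imported attributes except kMDItemTextContent); debug levels have varied between macOS releases, so check man mdimport on your system.

```shell
# Print the names of .mdimporter bundles found directly inside a directory.
list_importer_bundles() {
  for bundle in "$1"/*.mdimporter; do
    # Skip the unexpanded pattern when the directory holds no importers.
    if [ -e "$bundle" ]; then
      basename "$bundle"
    fi
  done
}

# Importers installed for all users of this Mac.
list_importer_bundles /Library/Spotlight

# Placeholder file used for the test import below.
SAMPLE="$HOME/Documents/example.pdf"

if command -v mdimport >/dev/null 2>&1; then
  # Ask Spotlight which importers it has loaded.
  mdimport -L
  # Test-import the sample file and print what the importer extracted.
  if [ -f "$SAMPLE" ]; then
    mdimport -t -d2 "$SAMPLE"
  fi
fi
```

Because -t does not write to the store, this is a safe way to check the output of an importer you are developing before installing it.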

Type man mdimport and press Return on your keyboard in Terminal to see the full usage information for mdimport.

mdls

mdls is a utility that lists the metadata attributes for a single file on disk, optionally limited to a predefined metadata key (or "tag"). Apple defines most of the metadata keys used by Spotlight, but if you write your own .mdimporter, you can define your own keys.
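Here is a short sketch of listing every attribute of a file, then only selected keys, then one bare value for use in a script. The file path and the helper names normalize_value and read_attribute are hypothetical. It relies on mdls printing the literal text "(null)" when an attribute is not set, which is its default behavior with -raw.

```shell
# Placeholder file; any indexed document will do.
FILE="$HOME/Documents/example.pdf"

# mdls -raw prints the literal text "(null)" for an attribute that is not set.
# Map that to an empty string so callers can test the result with -n or -z.
normalize_value() {
  if [ "$1" = "(null)" ]; then
    printf ''
  else
    printf '%s' "$1"
  fi
}

# Read one attribute ($2) of one file ($1) as a bare value.
read_attribute() {
  normalize_value "$(mdls -raw -name "$2" "$1")"
}

if command -v mdls >/dev/null 2>&1 && [ -f "$FILE" ]; then
  # Every attribute Spotlight has stored for the file.
  mdls "$FILE"
  # Only the two named keys.
  mdls -name kMDItemContentType -name kMDItemFSSize "$FILE"
  # A single value, suitable for use in a script.
  echo "type: $(read_attribute "$FILE" kMDItemContentType)"
fi
```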

Type man mdls and press Return on your keyboard in Terminal to use mdls.

mdfind

mdfind is a flexible and powerful utility that allows you to find all the objects in your file system hierarchy that match the metadata you specify, by searching the Spotlight store(s) on a specific volume.

Using various mdfind options, you can start a search at a specific location in the file system hierarchy, specify which metadata elements to match, and specify specific file names to match.

mdfind will only return results for files that match the search criteria you specify.

You can cancel an mdfind search while it is running by typing Control-C on your keyboard.

mdfind also has an -interpret flag, which allows you to specify a string in natural language, just as if you had entered it into Spotlight in the Finder. mdfind will interpret the string and adjust the search accordingly.

You can also combine mdfind with other standard UNIX utilities, such as grep, to perform complex searches and write the results to standard output or to a file.
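The options described above can be sketched as follows. The helper content_type_query, the search terms, and the output file name are all made up for this illustration; -onlyin, -count, -name and -interpret are standard mdfind options, and the query uses the kMDItemContentType key with a uniform type identifier.

```shell
# Build a query string that matches files by uniform type identifier.
content_type_query() {
  printf 'kMDItemContentType == "%s"' "$1"
}

if command -v mdfind >/dev/null 2>&1; then
  # Start the search at a specific folder and match PDFs only.
  mdfind -onlyin "$HOME/Documents" "$(content_type_query com.adobe.pdf)"

  # Print only the number of matches.
  mdfind -count "$(content_type_query public.plain-text)"

  # Match on file name instead of metadata.
  mdfind -name "invoice"

  # Let Spotlight interpret the string as if typed into its search field.
  mdfind -interpret "kind:pdf invoice"

  # Filter the results with grep and save them to a file.
  OUT="${TMPDIR:-/tmp}/report-matches.txt"
  mdfind -name "report" | grep -i "draft" > "$OUT" || true
fi
```

Keeping the query in a small function avoids quoting mistakes, since the query itself contains double quotes that must reach mdfind intact.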

Type man mdfind and press Return on your keyboard in Terminal to use mdfind.

There are several additional Spotlight utilities not mentioned here, which we'll cover in a future article.

Attribute Keys

Spotlight and Core Services file metadata work by storing each metadata element in the store under a unique key, or string. Each key tells Spotlight and the API which metadata element you're interested in.

Apple defines metadata keys as Core Foundation strings of type CFString, a common Core Foundation string type used in almost all Apple-related software. Using the Core Foundation API, you can also manipulate CFStrings directly from code.

Apple lists most metadata attribute keys in the File Metadata API documentation mentioned above. Most keys begin with the prefix kMD (short for constant – metadata).

To use the File Metadata API, you typically use one of its functions or one of Spotlight's functions and specify a metadata key to indicate which piece of metadata you want to use. Keys can be used both when retrieving and writing metadata.

For example, in Swift, the metadata API key for the “date added” metadata element for any file system object is defined as:

let kMDItemDateAdded: CFString!

Or in Objective-C:

const CFStringRef kMDItemDateAdded;

(In Objective-C, CFStringRef is an opaque Core Foundation type for CFString).

If you are an Apple developer using the File Metadata API, you will often use metadata keys.
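The same key names also work from Terminal, which can be a quick way to explore a key before using it in code. The sketch below uses the kMDItemDateAdded key from the example above, both to read a value and to search on it. The helper added_since_query and the choice of the Downloads folder are assumptions for illustration; $time.today(-N) is part of the Spotlight query expression syntax and means "N days before today".

```shell
# Build a query for items added within the last N days.
added_since_query() {
  printf 'kMDItemDateAdded >= $time.today(-%d)' "$1"
}

# Folder used for the demonstration; any indexed folder will do.
DIR="$HOME/Downloads"

if command -v mdfind >/dev/null 2>&1 && [ -d "$DIR" ]; then
  # Read the key for one item.
  mdls -name kMDItemDateAdded "$DIR"
  # Search on the same key: everything added to the folder in the last week.
  mdfind -onlyin "$DIR" "$(added_since_query 7)"
fi
```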

AVMetadataItem

For audio/video media files, Apple provides one additional API within AVFoundation.

There are several reasons for this. For example, media metadata typically needs to be loaded asynchronously at runtime to prevent latency during media playback, and some metadata is required by media industry standards. Some laws in various regions also require that owner and author metadata be embedded in media files in certain ways.

The central data type of the Apple metadata item in AVFoundation is called AVMetadataItem. AVFoundation provides various APIs for accessing and writing AVMetadataItem.

There is also a corresponding set of AVMetadataItem attributes (keys) used to access the AVMetadataItem.

Each AVFoundation media resource is defined by an AVAsset data type.

Tracks within each resource are defined by Apple as AVAssetTrack.

Each AVAsset or track can have one or more AVMetadataItems attached to it.

You can create AVAsset objects in code using various AVFoundation APIs, which can load them from a file (such as a QuickTime or .mp3 file) or even from a direct Apple HLS stream.

You should also be familiar with the asynchronous media loading API, implemented as the AVFoundation AVAsynchronousKeyValueLoading protocol.

Once you have an AVAsset or AVAssetTrack object in code, you can manipulate its metadata attributes at will and write them back to the source.

For complete information about AVFoundation assets and tracks, see the developer page Documentation/AVFoundation/Media assets.

For a complete list of all AVFoundation metadata keys, see the developer page Documentation/AVFoundation/Media assets/AVMetadataKey.

AVFoundation is a complex framework, and its API has hundreds of keys.

At first glance, Spotlight metadata seems like a complex topic, but its API is quite easy to use. The CLI utilities are also simple and easy to understand with a little practice.

Using these tools, you can effortlessly configure and search Spotlight data across all indexed volumes.
