Deconstructing Social Media Discovery

Jun 27, 2024 10:22:20 AM / by Ryan Short

Americans love social media. About 9% consider themselves to be influencers! In January 2024, the Pew Research Center reported that 83% of American adults use YouTube, 68% use Facebook, and 47% use Instagram. Most users have accounts on multiple platforms and many actively create content.

That all adds up to a whole lot of potentially discoverable data. 

And social media content is exceptionally varied, from posts to photos and multimedia to links, comments, and all sorts of other data. The non-messaging content types give rise to most of the legal and practical challenges of social media discovery. Social media content such as posts, comments and multimedia files is important in these areas of civil litigation:

  • Family law

  • Personal injury, medical malpractice and workers' compensation

  • Employment lawsuits and investigations

  • Copyright and trademark infringement

  • Product liability

Social media is dynamic, interactive and informal. Recognizing the challenges created by social media’s unique characteristics is the first step to overcoming them.

Identification and Preservation Challenges

The dynamic quality of social media makes identifying and preserving potentially relevant electronically stored information (ESI) more difficult than it is with traditional sources such as PCs and email. Because social media is in a constant state of flux, the best strategy is to act quickly to mitigate spoliation concerns.

Custodian questionnaires and interviews should include targeted questions about social media. Ask whether relevant data exists on any social media site.

The opposing party’s social media may also be at issue, particularly in employment and personal injury cases. There are several legal tools available to learn about the existence of discoverable social media. These include discovery letters following up on initial disclosures, raising social media in the discovery conference and asking about it in interrogatories.

In addition, it can be helpful to perform internet searches of publicly available social media content. These searches may themselves locate relevant data, or they may supply information needed to make a threshold showing of relevance necessary to obtain discovery of non-public data.

Most types of ESI can be defensibly preserved in place with a properly managed litigation hold. Social media is different. Not only is it constantly changing, but many of the changes are outside the account holder’s control.

Accordingly, best practice is to make a preservation collection. Typically, the entire user account is captured, although a targeted capture of relevant data may be warranted in order to exclude non-relevant, private information. Supplemental preservation collections can be made to capture relevant content created on an ongoing basis.


Data Collection Challenges

One of the striking features of social media is the many different data types found in one account or even one post. The interactive nature of social media creates content that is dissimilar, complex and stored in multiple locations. This diversity of content creates technical hurdles to collection.

One solution for meeting these collection challenges is to partner with a digital forensics service provider. A qualified provider can offer competent advice on technology issues and options.

In addition, providers use forensic software developed specifically for social media collections. Forensic software has several advantages:

  1. It’s specifically designed to collect all types of social media content, including multimedia files and metadata. The date and time that a social media post was created may be as important as the content.

  2. Forensic software has the capability to follow links and collect the linked content, even if it is found on a third-party website.

  3. Social media forensic collection software uses hash value verification, the industry standard means of validating that the collection is a complete and accurate copy of the original.
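
To make the hash verification idea in the last point concrete, here is a minimal sketch of how a copy can be checked against its original. It assumes SHA-256 as the hash algorithm (the algorithm a given forensic tool uses may differ), and the function names `sha256_of_file` and `verify_copy` are illustrative, not part of any forensic product.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large exports do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(original: Path, copy: Path) -> bool:
    """Return True when both files produce the same hash value."""
    return sha256_of_file(original) == sha256_of_file(copy)
```

If even one byte of the copy differs from the original, the two hash values will not match, which is why a matching hash is treated as evidence that the collected file is complete and unaltered.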

There are two self-collection alternatives to using forensic software. The obvious advantage of self-collection is its modest price tag. But be aware that self-collection has significant limitations that often outweigh the immediate cost savings.

First, some platforms, including Facebook and Twitter, offer a “download my account” feature that permits account holders to request downloadable versions of their accounts from the social media provider. A user (or authorized proxy) must be logged in to request a download.

Second, screen captures are an alternative when the platform doesn’t offer a “download my account” option or the party making the collection doesn’t have login access to the account. If you take screen captures, use a vetted capture program with a timestamp feature.


Review Challenges

The informality of social media requires a different approach to analysis and review than standard ESI. Standard eDiscovery tools and workflows are designed for business communications, which are typically written in complete sentences with proper grammar, spelling and punctuation. Business communications also use a small universe of widely recognized symbols, most of them found on a QWERTY keyboard.

Social media is unlike business documents in pretty much every way because it is characterized by:

  • Very short text

  • Misspellings

  • Acronyms, abbreviations, and slang

  • Emoji

  • Multimedia

  • Links

To make matters even more challenging, acronyms, abbreviations, and emoji are continually developing, carry multiple meanings, and are used to mean different things by different people. Complicating the situation even more, emoji are platform-dependent, meaning the same emoji may look different when viewed on two different devices.

While keyword searching is many lawyers’ default starting point for eDiscovery review, it’s not necessarily a viable option with social media. As an initial matter, electronic files are indexed during eDiscovery processing to make them text searchable. There is a threshold text requirement for indexing. Some social media content does not have enough text to meet this requirement.
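
As a rough illustration of the text threshold problem, the sketch below separates posts that have enough text to search from posts that will need another review method. The threshold of 20 characters and the names `MIN_INDEXABLE_CHARS`, `has_indexable_text` and `split_by_indexability` are assumptions made for this example; the actual requirement depends on the processing tool.

```python
# Hypothetical threshold: real processing tools set their own minimum.
MIN_INDEXABLE_CHARS = 20


def has_indexable_text(post: str, minimum: int = MIN_INDEXABLE_CHARS) -> bool:
    """Return True if the post has at least `minimum` letters or digits."""
    # Emoji, punctuation and whitespace are not counted as searchable text.
    searchable = sum(1 for ch in post if ch.isalnum())
    return searchable >= minimum


def split_by_indexability(posts: list[str]) -> tuple[list[str], list[str]]:
    """Split posts into (searchable, needs_other_review)."""
    searchable = [p for p in posts if has_indexable_text(p)]
    other = [p for p in posts if not has_indexable_text(p)]
    return searchable, other
```

Running a check like this early gives a sense of how much of the collection keyword searching can actually reach.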

Moreover, it's difficult to formulate effective keyword searches for social media. The text content of social media simply does not lend itself to word searches. Many social media posts replace words with graphics, symbols or links. Non-standard English is commonplace.

So long as these significant limitations are recognized, keyword searches can still be very useful. Names of individuals and companies, which are important search terms in most matters, are a good example. Unlike other social media content, names are likely to be written out in full and spelled correctly.
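
A name search of this kind can be sketched in a few lines. This is a simplified example, not how any particular review platform implements searching: the function names are hypothetical, and it only matches names written out in full, ignoring case.

```python
import re


def build_name_pattern(names: list[str]) -> re.Pattern[str]:
    """Compile one case-insensitive pattern that matches any full name."""
    # Longer names first so "Jane Doe Jr" wins over "Jane Doe".
    ordered = sorted(names, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in ordered)
    # Lookarounds keep the match from starting or ending mid-word.
    return re.compile(rf"(?<!\w)(?:{alternatives})(?!\w)", re.IGNORECASE)


def posts_mentioning(posts: list[str], names: list[str]) -> list[str]:
    """Return the posts that mention at least one of the names."""
    if not names:
        return []
    pattern = build_name_pattern(names)
    return [post for post in posts if pattern.search(post)]
```

Note that a handle such as "janedoe123" would not be found by a search for "Jane Doe", so usernames and known nicknames need to be added as separate search terms.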

Another useful standard review technique is filtering. Filtering is a simple technique to create a review set based on criteria such as date range and file type. For instance, photos are critical social media evidence in many employment and personal injury cases. Filtering is a quick and easy way to focus review on the file types typically created by digital cameras and smartphone cameras.
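
Here is a minimal sketch of a date range and file type filter. The `CollectedItem` record, the `filter_items` function and the list of photo extensions are all assumptions made for illustration; a review platform would apply the same idea to its own metadata fields.

```python
from dataclasses import dataclass
from datetime import date
from pathlib import PurePath

# Illustrative list of common camera and smartphone photo extensions.
PHOTO_EXTENSIONS = frozenset({".jpg", ".jpeg", ".png", ".heic"})


@dataclass(frozen=True)
class CollectedItem:
    filename: str
    created: date


def filter_items(
    items: list[CollectedItem],
    start: date,
    end: date,
    extensions: frozenset[str] = PHOTO_EXTENSIONS,
) -> list[CollectedItem]:
    """Keep items created within [start, end] whose extension is allowed."""
    return [
        item
        for item in items
        if start <= item.created <= end
        and PurePath(item.filename).suffix.lower() in extensions
    ]
```

For example, filtering a collection to photos created in the weeks after an alleged injury produces a small, focused review set without relying on any text at all.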

Finally, ask your service provider whether converting data to RSMF (Relativity Short Message Format) is feasible. Your brain is trained to process data from social media platforms as you see it in its native form. Rendering it in a near-native way can positively impact the speed and accuracy of review.



Social media is on the leading edge of eDiscovery, and review strategy is still developing. The greatest challenge may be adopting a mindset that words are relatively unimportant. The informal, non-text content of social media requires creative approaches to analysis and review. In many instances, it will also require old-school linear (i.e., record-by-record) review.

Social media’s distinctive characteristics create practical challenges in discovery from identification through review. Successfully navigating this shifting and unfamiliar landscape demands an understanding of how social media is different from traditional data sources and the eDiscovery implications of those differences.

Read more about all types of short message data in my new guide, "Mastering Short Message Data".

Tags: Data Collection, Data Preservation, eDiscovery, ESI, Digital Forensics

Written by Ryan Short

Ryan joined Proteus in 2020. He is an MBA and a Certified eDiscovery Specialist with over a decade of experience in publicly traded, PE-backed, and bootstrapped entities focused on technology-enabled services. Ryan lives in Indianapolis with his wife and their 5 children under the age of 9. Consequently, his wife won't let him buy a dog.