The University of Maryland Data Breach: Insights and Questions

Wow, that brings back memories!

Gabriel J. Michael / gmichael at gwu dot edu

Yesterday evening, the University of Maryland announced that it had suffered a data breach exposing personally identifiable information of approximately 309,079 individuals, including current and former students, current and former faculty and staff, administration, and affiliated personnel.

Specifically, the University is reporting that attackers were able to copy a database containing the full names, Social Security Numbers (SSNs), birth dates, and University ID numbers (UIDs) of everyone issued a university ID since 1998.

The University’s response so far has been reasonable. The breach is thought to have occurred early Tuesday morning, and the University began notifying the public Wednesday evening, less than 48 hours later. Outside investigators are being brought in to examine what exactly happened.

However, there remain several serious issues that must be addressed by the University within the coming days and weeks to ensure an adequate and appropriate response to this incident. As an alumnus with some knowledge of the University’s security practices in the past, here are my thoughts.

The University Must Immediately Address Related Security Issues

I attended the University of Maryland, College Park, between 2003 and 2007. This morning, I logged into the Testudo registrar system using my Student ID number (SID) and PIN. As the Testudo website helpfully notes, your “Student ID, in most cases, will be your Social Security Number” and “Your PIN is originally set to your six (6) digit birth date (e.g. mmddyy or 012585).”

In my case, both these statements were accurate, and I was able to log in and access the transcript request service using my SSN and birth date. I suspect this is the case for large numbers of alumni, if not current students. I vaguely remember changing this PIN while I attended the University, but even if I wanted to now, there is no obvious way to do so.

This means that the attackers (or more likely, anyone they have sold the data to) can currently request academic transcripts for any current or former student. Apart from the actual breach itself, this is almost certainly a separate FERPA violation. Unscrupulous individuals could use these transcripts in a variety of interesting ways.

If it has not already done so, the University should immediately begin monitoring the transcript request service and other related services that rely on the SID/PIN combination to prevent fraudulent access, and perhaps impose additional informational requirements before granting access to these services. The University should also advise affected individuals how they may change their PIN.
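As a rough illustration of why this matters, here is a minimal sketch of a check the University could run to find accounts still using the default PIN. It assumes the mmddyy format quoted from the Testudo page; the function names are hypothetical, and comparing a plain-text PIN is a simplification, since I do not know how Testudo actually stores or verifies PINs.

```python
from datetime import date


def default_pin(birth_date: date) -> str:
    """Return the six-digit mmddyy PIN format quoted from the Testudo page."""
    return birth_date.strftime("%m%d%y")


def still_uses_default_pin(current_pin: str, birth_date: date) -> bool:
    """True if the PIN was never changed from the birth-date default.

    Hypothetical helper: assumes the PIN is available for comparison,
    which may not match how the real system verifies it.
    """
    return current_pin == default_pin(birth_date)


# The example from the Testudo notice: January 25, 1985
print(default_pin(date(1985, 1, 25)))  # prints 012585
```

Any account flagged this way is protected only by two values that were both in the copied database, so those are the accounts that most urgently need a forced PIN reset or extra verification.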

One Year of Free Credit Monitoring is Inadequate

As part of its response, the University has committed to providing “one year of free credit monitoring” through an as-yet unspecified company to those affected. While better than nothing, this is inadequate. Unlike data breaches involving credit or debit card numbers, which can be reissued, this breach released SSNs, which are extremely difficult to change.

I was affected by the recent Target data breach, which exposed my credit card number to the attackers. However, my credit card company is issuing a new card with a new number, making the old card information useless. In contrast, the information obtained from the UMD data breach will remain valuable for decades to come. In fact, it will become even more valuable in a year when the free credit monitoring has expired.

The University Must Disclose the Technical Details of What Happened

So far, we know virtually nothing about the technical details of the attack. Given that the attack is still under investigation, this is understandable. However, when the investigation is complete, or at least when enough detail has been uncovered, the University must disclose the technical details of what exactly happened. There are several reasons why such disclosure is important.

First, it provides a valuable public service. If other organizations rely on similar security measures or software packages, they should have the opportunity to address the security flaws that affected the University. By not releasing details of the attack, the University would be ensuring that such attacks could be repeated in the future against other organizations.

Second, it allows us to verify University statements. The University is currently describing this incident as a “sophisticated computer security attack.” The Diamondback is calling it a “massive cyberattack.” The University’s Chief Information Officer is saying ‘The hacker or hackers must have had a “very significant understanding” of how the school’s data are designed and protected… in contrast with typical attacks,’ claiming “These people picked through several locks to get to this data.”

Perhaps these statements are true, but given my previous experience with UMD’s security practices (discussed below), I have my doubts. The Target breach was a relatively sophisticated attack. Without any technical details, we don’t know if the UMD breach was due to carelessness, negligence, an honest mistake, or whether appropriate measures were actually in place and were simply defeated.

Third, by releasing the technical details of the attack, the University will be forced to discuss how they have responded to the attack to prevent its recurrence. Knowledgeable members of the public can then assess the response to see if it is adequate. In the past, the University has responded to security problems with a rushed and inappropriate response, creating further problems down the road (discussed below).

Could the Breach Have Been Easily Avoided?

Without technical details, there is no way to answer this question with any certainty. However, my previous experience with security practices at UMD gives me pause.

I attended UMD between 2003 and 2007. Between at least 2003 and 2006 (and probably earlier), the University used students’ SSNs as their primary identifier. In order to get a transcript or interact with the registrar, you provided your SSN. If you lost your ID card, or forgot it when you went to the gym, a helpful student worker would ask, “What’s your soc?” (pronounced “sōsh”). Several times I remember having to write my SSN on the cover page of academic documents.

The SSN was also stored on the magnetic stripe of every student’s University ID card, although the number was not printed anywhere on the card.

This was a bad practice, and eventually the University began to transition from using SSNs to University ID numbers (UIDs). It also re-issued ID cards to the entire university population. This latter decision was at least in part prompted by the work of a group of students who were studying the University’s security and access protocols. I was informally involved with this group. For a detailed overview of their work (minus some redactions), read this paper.

Unfortunately, in its haste to remove SSNs from ID cards and despite being warned about the problems it could cause, the University made the poor decision to replace the SSN with the UID. This presented a problem, since the UID was publicly accessible for the entire student/faculty/staff population on an LDAP directory server. This meant that a malicious individual could look up the UID of any individual, and create an ID card that would allow them physical access to any location the individual could normally access.

The university eventually rectified this mistake, if I recall correctly, by re-encoding the recently re-issued ID cards with a meaningless identifying number that was not publicly accessible, as they should have done in the first place.
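To make the idea of a “meaningless identifying number” concrete, here is a minimal sketch of generating one. I do not know how the University actually produced its card numbers; the function names and the 16-byte length are my own assumptions, and the UIDs are made up.

```python
import secrets


def new_card_token(num_bytes: int = 16) -> str:
    """Return a random hex string unrelated to the SSN, UID, or any directory field."""
    return secrets.token_hex(num_bytes)


def issue_cards(uids):
    """Map each UID to a fresh random token.

    Hypothetical helper: the mapping would live only in the private
    access-control system, never in a public directory.
    """
    return {uid: new_card_token() for uid in uids}


cards = issue_cards(["100000001", "100000002"])  # made-up UIDs
for uid, token in cards.items():
    print(uid, token)
```

Because the token is drawn at random rather than derived from anything published, looking someone up in the directory tells an attacker nothing about what is encoded on that person's card.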

I could go into much more detail about the above, but it is mostly technical and not necessarily related to the data breach. Along the same lines, when students discovered that the University was storing location access information from their ID cards, and even using this information in police investigations, the University initially denied this was the case.

I hope UMD does better this time. Perhaps there have been significant changes in the past seven years. But I’ll note that the transition away from SSNs was approved back in 2005, and here we are, nine years later, facing this breach.

Organizations Must Take Steps to Limit the Collection and Retention of Unnecessary Data

In closing, this incident highlights the danger of the collection and retention of unnecessary data. Note that I am not saying that this particular database should not have included the information it did. There are many valid reasons for the University to have a database with this kind of information; e.g., alumni from years past need to be able to access records, and the school needs a way of verifying that they are who they say they are. (Technically this can be accomplished without storing the actual SSN, but see the addendum below for why this approach might not work.)

However, I think this kind of incident should lead us to think carefully before we assemble large databases of information that we do not necessarily need. For example, many states and localities are using license plate readers to collect location and timing information of cars. Very few police departments using this technology have developed rules or guidelines about who can access the data, how long it will be retained, for what purposes it can be used, and with what other organizations it can be shared. These are the sorts of things that should be thought about before collection begins, and not after a breach has occurred.

Addendum: One Facebook commenter asked why UMD couldn’t have simply used a hash function to avoid storing the SSN at all. Obviously it wouldn’t work for current students, since the University needs to issue W-2s, 1098-Ts, and other tax documents that include the SSN, but why not for alumni?

There’s probably no good answer for why this wasn’t done, but there are probably many bad answers. E.g., there are probably reporting requirements to the state, IRS, law enforcement, etc. that might require the university to produce the SSNs of former students. Also, even if they did hash the SSNs, an attacker could easily brute force the relatively limited number of SSNs (9 digits, so at most 1 billion combinations, before considering rules that significantly reduce the search space) for each student. Salting the hashes would prevent the use of precomputed tables, but with so small a search space it would only slow the attacker down. Now maybe that problem could have been solved by encryption rather than hashing. But given that this database might have been structured in 1998 or even earlier, it’s possible no one was thinking along high security lines back then.
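For the curious, here is a minimal sketch of the brute-force problem. Nine decimal digits give at most 10^9 candidates, which a fast hash can cover quickly on ordinary hardware. The choice of SHA-256, the function names, and the salt handling are my own assumptions for illustration, not anything the University is known to use. The sample value starts with 000, which is never issued as a real SSN, and the demo searches only a small slice so it finishes instantly.

```python
import hashlib

SSN_SPACE = 10 ** 9  # nine decimal digits: at most one billion candidates


def hash_ssn(ssn: str, salt: bytes = b"") -> str:
    """Hash an SSN with an optional salt (SHA-256 is an illustrative choice)."""
    return hashlib.sha256(salt + ssn.encode("ascii")).hexdigest()


def brute_force(target_hash: str, salt: bytes = b"",
                start: int = 0, stop: int = SSN_SPACE):
    """Try every nine-digit candidate in [start, stop) until the hash matches."""
    for n in range(start, stop):
        candidate = f"{n:09d}"
        if hash_ssn(candidate, salt) == target_hash:
            return candidate
    return None


# Demo on a small slice of the space, using an invalid (000-prefixed) value.
target = hash_ssn("000012345")
print(brute_force(target, stop=20_000))  # prints 000012345

# A salt stored next to the hash does not help: the attacker just reuses it.
salted = hash_ssn("000012345", salt=b"per-record-salt")
print(brute_force(salted, salt=b"per-record-salt", stop=20_000))  # prints 000012345
```

The salted case shows the limit of salting here: it forces the search to be repeated for each record, but each search is still over the same small set of candidates.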

About Gabriel

Ph.D. in political science. Postdoc and resident fellow at Yale Law School's Information Society Project. Tech geek. Mechanically inclined. I study the politics of intellectual property.