A federal judge unsealed the source code for a software program that was used to compare DNA samples in New York City’s crime lab.
In July 2016, Judge Valerie Caproni of the Southern District of New York determined in U.S. v. Johnson that the source code of the Forensic Statistical Tool, a genotyping program, “is ‘relevant … [and] admissible’” at least during a Daubert hearing, a pretrial hearing where the admissibility of expert testimony is challenged. Caproni issued a protective order covering the code at that time.
This week, Caproni lifted that order after the investigative journalism organization ProPublica filed a motion arguing that there was a public interest in the code. ProPublica has since posted the code to the website GitHub.
Probabilistic genotyping is used when comparing complex DNA samples, such as mixtures. It does not define a DNA sample itself; rather, it is interpretive software that runs multiple scenarios, like risk analysis tools used in finance, to analyze the sample. This contrasts with traditional DNA analysis, which determines whether a DNA type is present or absent.
While there are numerous public and proprietary genotyping software programs, ProPublica reports that the “FST was used to analyze crime-scene evidence in about 1,350 cases over about 5½ years.” The FST was phased out at the beginning of this year with the adoption of different software.
The debate over the use of algorithms in criminal proceedings has heated up in the past couple of years. Last year, the Wisconsin Supreme Court ruled against a defendant seeking access to an algorithm that deemed him a public safety risk. The Indiana Supreme Court ruled similarly in 2010.
On genotyping software, court rulings have been more mixed.
In California, People v. Chubbs, a cold murder case from the 1970s, brought this issue to light in 2012. The prosecution used evidence from TrueAllele, genotyping software from for-profit company Cybergenetics. The TrueAllele evidence stated that a match between DNA found on the victim and Chubbs, who is black, was “1.62 quintillion times more probable than a coincidental match to an unrelated black person.” The trial court determined that Chubbs was entitled to examine the source code under protective order.
This decision was overturned on appeal in 2015. The appeals court said that Chubbs’ stated reasons to access the source code, even under protective order, did not outweigh trade secret protections. Further, as the court wrote, “access to TrueAllele’s source code is not necessary to judge the software’s reliability,” because validation studies and expert testimony are sufficient to make that determination.
Back in New York, state Judge Mark Dwyer has kept FST evidence out of court at least four times, according to ProPublica. In a Manhattan court on Oct. 16, Dwyer said he did so because he has doubts about the program’s acceptance in the scientific community, especially after New York discontinued its use.
Judge Caproni’s decision to release the source code was not the only algorithmic transparency event to happen in New York this week.
A public hearing was held on Monday regarding a New York City Council bill that would require the public posting of algorithms used by New York City government agencies. The bill's author believes it to be the first of its kind in the U.S.
Don Sunderland, the deputy commissioner for Enterprise and Solution Architecture at the New York City Department of Information Technology & Telecommunications, raised concerns about the workability and security of such a broad piece of legislation, according to Civicist, a civic technology news blog.
Legal and civil rights organizations like The Bronx Defenders, The Brennan Center for Justice at NYU Law School and the New York Civil Liberties Union all had representatives testify in support of the bill.
The future of the bill is not known, but its sponsor, James Vacca, told ProPublica that he would revise the legislation in response to the confidentiality concerns raised in testimony at the hearing.