Two Legal Publishing Acquisitions Announced

Today LexisNexis announced via email that it is acquiring Knowledge Mosaic, a Seattle-based publisher that specializes in federal regulatory and disclosure information. Separately, Thomson Reuters announced that it has entered into an agreement to acquire Practical Law Company, a UK-based legal publisher that targets transactional lawyers.


Deleting the Law

Back in May, Christine Kirchberger posted an interesting quote from 1968 relating to the growth in the size of the law. Ms. Kirchberger goes on to briefly argue that perhaps a formal system of identifying and deleting “non-relevant legislation and case-law” could help improve the performance of legal information retrieval (IR) systems (i.e., something akin to the delete movement in the area of privacy regulation).

Although I understand the frustration of dealing with an ever-growing mountain of data, I think the solution to this challenge lies in improving IR technology, not in forcibly reducing the amount of content to be indexed. Further, the assumption that non-current legal information would be excluded from IR systems is simply wrong. In a common law system especially, case law is never really non-relevant, no matter how much time has passed without it being cited or referenced. In addition, there are numerous research scenarios in which historical information (what the law used to be at a particular time in the past) is the goal (e.g., auditing or litigation over past actions). While I admit many laws could be simplified or reduced in size, much of the growth in the law is more likely due to an increasingly complex society and the incremental way in which the law grows.

Conference on Internet Privacy, Social Networks and Data Aggregation (Mar. 23, 2012)

I recently attended the Conference on Internet Privacy, Social Networks, and Data Aggregation, which was held at my old law school, Illinois Institute of Technology (IIT), Chicago-Kent College of Law. The conference was hosted by the Center for Information, Society, and Policy on Friday, March 23, 2012. There were a number of interesting speakers, some of whom I have listed below (see the complete conference agenda), along with some of my thoughts on the issues they covered.

Data Science Chicago Meetup (Mar. 22, 2012)

Today, I attended a presentation about government data hosted by Data Science Chicago, a Chicago-based meetup group. The presentation was interesting both because of the personal background of the speaker, Brett Goldstein, and because of the number of interesting projects discussed that use open government data. Mr. Goldstein was the IT director for OpenTable before joining the Chicago Police Department.

During his presentation, he explained how his role as a police officer led him to found the Chicago Police Department’s Predictive Analytics Group, an effort to use patterns in incident-level crime data to predict future incidents of crime. According to Mr. Goldstein, the group’s predictions were able to focus police patrols on the 1-2% of the city (down to the census block level) in which murders or other violent crimes were likely to occur.

Mr. Goldstein is now the City of Chicago’s Chief Data Officer and, at the event, he talked about the city’s effort to make government data open to the public and a number of projects using that data. According to Mr. Goldstein, the city’s data portal has already released the incident-level crime data going back 10 years — the biggest such collection of open data in the world. His more recent efforts have focused on using MongoDB for spatially-focused time series data and the release of the city’s 311 data. The speaker also touched on a number of related topics, including the use of regression analysis, the treatment effect, the need for more useful geographical boundaries other than census blocks, and advice for aspiring data scientists on the skills needed to be effective.

Overall, it was an interesting presentation, and it made me want to take a closer look at the data sets available through these government portals. While I was already familiar with data.gov for federal-level data, I was surprised to find so much data available at my city, county, and state levels.

Natural Language Processing Course

Back in December, I signed up for a free online class on natural language processing, taught by two Stanford University professors, Dan Jurafsky and Chris Manning, and offered through Coursera. While it was originally set to start in January, it was delayed and will now begin March 12, 2012 (registration is still open, I believe). I’ll likely post about some of the topics covered in the class, especially how they may relate to applications using legal content.