Scraping technique:
- Scrapers are popular tools for extracting information from college Web sites, but they're fragile; they can slow down loading times for other visitors if they are programmed badly. And when colleges redesign their sites, many scrapers stop working.
- Realizing that students are already digging for data on their sites, some institutions are creating their own platforms to satisfy the demand.
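The two failure modes described above (overloading a site and breaking silently after a redesign) can be illustrated with a minimal sketch. It is not taken from any tool mentioned in the article: the `Throttle` class, the `extract_course_titles` function, and the `course-title` CSS class are all hypothetical names chosen for illustration, and the markup of a real college site would differ.

```python
import time
from html.parser import HTMLParser


class Throttle:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests
        self._clock = clock               # injectable so it can be tested
        self._sleep = sleep
        self._last = None                 # time of the previous request

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


class CourseTitleParser(HTMLParser):
    """Collect text inside <td class="course-title"> cells (assumed markup)."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "td" and ("class", "course-title") in attrs:
            self._in_title = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.titles[-1] += data


def extract_course_titles(html):
    """Return course titles, or raise if the expected markup is missing."""
    parser = CourseTitleParser()
    parser.feed(html)
    parser.close()
    titles = [t.strip() for t in parser.titles if t.strip()]
    if not titles:
        # A redesign usually shows up as "nothing matched"; fail loudly
        # instead of returning an empty list that looks like valid data.
        raise ValueError("no course titles found; the page layout may have changed")
    return titles
```

A scraper built this way would call `Throttle.wait()` before each page request so that it never hits the site faster than the chosen interval, and it would treat the `ValueError` as a signal that the parser needs updating.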
The first project, at Harvard:
This year Harvard University welcomed a contractor who joined its information-technology staff in an effort to create live data feeds from things like course catalogs, for use in computer-science programming projects. The team calls its work "data wrangling"—a not-so-subtle hint at the difficulty of coaxing entrenched software systems into providing data that easily plug into applications.
In the past, says Katie L. Vale, director of academic technology at Harvard's Faculty of Arts and Sciences, computer-science courses had to use "dummy" data feeds, which only mimicked real information.
Most important, live updates to the data were missing. "The data was coming from really old, crusty systems, and it was not in a state that would make it easy for us to get a data feed that the students could use" for their programming projects, she says.
The long-term goal of Harvard's effort is to give students a Web portal where they can access data on courses, shuttle schedules, and dining-hall menus, among other things, says Ian T. Wall, the university's associate director of enterprise data and business-intelligence services. Students have even asked him about creating real-time data for campus laundry facilities so they can tell when washing machines are empty.
At the University of Waterloo,
"There was a bit of reluctance, because people often said, 'That's our data, we can't just release that,'" he says. "But we made the argument that students are grabbing it anyway, and probably screwing it up and making mistakes, so the best thing we can do is give them some good, clean data with a bit of control."
- Any information that's publicly available and not hidden behind a password-protected screen is fair game for developers, he adds.
- So far, course and exam-schedule information is available, and there are plans to extend the platform.
The goal is to empower the university's network of student programmers. "They look at some of our systems and say, 'That's terrible. I could do a better job,'" he says. "So we say, 'Here's the data; see if you can.'"
A laundry app may sound like a trivial use of university data, but entrepreneurs believe they can use other kinds of information to transform the college experience—like an application that helps students select which college to attend on the basis of their career goals. Or a degree-planning tool that could help students graduate with less debt.