NYU Programming Job: Papyrological Navigator

New York University: Programmer/Analyst (7421BR)

New York University’s Division of the Libraries seeks a Programmer/Analyst to work on the "Papyrological Navigator" (http://papyri.info), a major web-based research portal that provides scholars worldwide with access to texts, transcriptions, images and metadata related to ancient texts on papyri, pottery fragments and other material. The incumbent will work closely with the Project Coordinator (at Columbia University) and with scholars involved in the project at NYU's Institute for the Study of the Ancient World, Duke University and the University of Heidelberg, as well as with NYU Digital Library Technology staff.

The incumbent's initial responsibilities will include: migrating existing PN software applications from Columbia University to NYU; optimizing performance as needed; establishing a robust production environment at NYU for the ongoing ingest and processing of new and updated Greek text transcriptions, metadata and digital images; performing both analysis and programming of any required changes or enhancements to current PN applications.

This is a grant-funded position and is available for 2 years.

Candidates should have the following skills:

  • Bachelor's degree in computer or information science and 3 years of relevant experience or equivalent combination
  • Must include experience developing applications using Java
  • Demonstrated knowledge of Java, Tomcat, Saxon, Lucene, Apache, SQL, XML, XSLT
  • Experience with metadata standards (e.g. TEI, EpiDoc)
  • Experience working in a Unix/Linux environment
  • Preferred: Experience with image serving software (eRez/FSI), Java Portlets, Apache Jetspeed-2, and Velocity templates.
  • Preferred: Experience designing, building, and deploying distributed systems.
  • Preferred: Experience working with non-Roman Unicode-based textual data (esp. Greek)
  • Excellent communication and analytical skills

Applicants should submit a resume and a cover letter that reflects how the applicant’s education and experience match the job requirements.

Please apply through NYU's application management system: www.nyu.edu/hr/jobs/apply.

At this page click on "External Applicants" then "Search Openings." Type 7421BR in the "Keyword Search" field and select search. NYU offers a generous benefit package including 22 days of vacation annually. NYU is an Equal Opportunity/Affirmative Action Employer.

New York University Libraries: Library facilities at New York University serve the school’s 40,000 students and faculty and contain more than 4 million volumes. New York University is a member of the Association of Research Libraries, the Research Libraries Group and the Digital Library Federation, and serves as the administrative headquarters of the Research Library Association of South Manhattan, a consortium that includes three academic institutions. The Library’s website URL is http://library.nyu.edu

BAtlas IDs: Maps 10-13, 20-21, 49

README file for Barrington Atlas Identifiers, version published 2008-08-05
Reference URL: http://atlantides.org/batlas

Background: http://horothesia.blogspot.com/search/label/batlasids
New maps covered in this release: 10, 11, 12, 13, 20, 21, 49
List of all maps presently covered: 10, 11, 12, 13, 20, 21, 22, 23, 35, 36, 37, 38, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 65, 72, 73, 74, 75, 76, 77, 78, 79, 80, 84, 85, 86, 87, 87 inset, 88

Major classes of change from prior versions are listed below. Consult the individual files named like map22-diff.txt for diffs between the prior version and this version.

* No changes to previously released IDs.

Pondering Change to Atlantides Aggregators: Excavation Blogs

The subscription list for Maia Atlantis is getting pretty huge. In a recent post, Bill Caraher reminded me that there's a big (and growing) genre of excavation blogs. I think this genre is heavily underrepresented in the Atlantides feed aggregator constellation.

It occurred to me that it might be worthwhile to put dig-specific blogs into their own aggregator, and pull the few currently in Maia out and put them there too.

On the up side, that might help keep Maia to a manageable size. On the down side it would mean splitting up what has, until now, been a one-stop shop for ancient world blog content. And there would inevitably be some blogs in which lots of interesting non-excavation posts appear alongside hard-core dig news and status.

Thoughts?

BAtlas ID update: maps 23, 84, 85, 87, 87 inset, 88 and fixed dates

README file for Barrington Atlas Identifiers, version published 2008-08-04
Reference URL: http://atlantides.org/batlas

Background: http://horothesia.blogspot.com/search/label/batlasids
New maps covered in this release: 23, 84, 85, 87, 87 inset, 88
List of all maps presently covered: 22, 23, 35, 36, 37, 38, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 65, 72, 73, 74, 75, 76, 77, 78, 79, 80, 84, 85, 86, 87, 87 inset, 88

Major classes of change from prior versions are listed below. Consult the individual files named like map22-diff.txt for diffs between the prior version and this version.

  • All readme files, dated folders and compressed tar files have been modified and renamed as necessary to redress the erroneous substitution of 2007 for 2008. No changes to IDs have occurred.

BMCR and SAFE Events Feeds Pulled from Maia

It is with regret that this morning I have pulled the Bryn Mawr Classical Review (BMCR) Most Recent Articles feed from the list of feeds aggregated by Maia Atlantis. I took this step in accordance with my own Atlantis Suppression Policy.

It would appear that every time the BMCR adds a new article, the dates on all articles in the feed are updated to the present. As of this writing every single entry contains an identical "pubdate" tag with the value "03:49:18, Sunday, 03 August 2008" even though some of the entries have been in the list since it was first deployed a few weeks ago. This is non-standard behavior, and has the effect of pushing all the BMCR entries, in a block, to the top of any feed reader or aggregator, ahead of other content that is actually new. And in most feed readers they will show up highlighted or bolded, to indicate "new content." The appropriate behavior is to adjust dates only on those entries that have been added or substantially changed.

The Saving Antiquities for Everyone (SAFE) Events Feed is also blocked because it is forward-dating announcements of events to the date of the event, rather than the date of the entry. For example, the current feed contains a single entry with the following pubdate: "Thu, 16 Oct 2008 07:00:00 EST". This is the event date, not the publication date of the feed entry. This is also an abuse of feed entry date fields, and it causes these entries to linger at the top of the aggregation list for weeks or months until the date of the event passes.
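
For anyone who wants to test a feed for these two problems, here is a minimal sketch in Python. It assumes an RSS 2.0 feed whose items carry a `pubDate` element; the function names and the sample feed are made up for illustration, and this is not the code Maia itself runs.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical two-item feed in which both entries share one date.
SAMPLE = """<rss version="2.0"><channel>
<item><title>Review A</title><pubDate>Sun, 03 Aug 2008 03:49:18 GMT</pubDate></item>
<item><title>Review B</title><pubDate>Sun, 03 Aug 2008 03:49:18 GMT</pubDate></item>
</channel></rss>"""


def pubdates(rss_xml):
    """Return the raw pubDate string of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    return [item.findtext("pubDate", default="").strip()
            for item in root.iter("item")]


def all_dates_identical(dates):
    """True when a feed has several entries that all carry the same date string."""
    return len(dates) > 1 and len(set(dates)) == 1


def forward_dated(dates, now):
    """Return the date strings that parse as RFC 822 dates later than `now`."""
    late = []
    for raw in dates:
        try:
            when = parsedate_to_datetime(raw)
        except (TypeError, ValueError):
            continue  # dates that cannot be parsed are skipped, not judged
        if when.tzinfo is None:
            # Assumption: treat dates without a zone as UTC.
            when = when.replace(tzinfo=timezone.utc)
        if when > now:
            late.append(raw)
    return late


if __name__ == "__main__":
    dates = pubdates(SAMPLE)
    print(all_dates_identical(dates))
    print(forward_dated(dates, datetime.now(timezone.utc)))
```

The identical-date check compares the raw strings, so it also works on dates that are not in the standard RSS format. The forward-date check only judges dates it can parse.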

I will be contacting the editors of both resources in the hopes of resolving these technical difficulties so that their content can once again be featured in Maia Atlantis.

Hidden Web: Don't Love It, Leave It

There's been a bit of buzz lately about Google's "failure" to effectively search the "hidden (deep) web". In the discussions I've been seeing, the hidden web is equated with stuff in academic and digital library repositories, i.e., "OAI-based resources" (which I assume to mean OAI/PMH).

I have to say: repositories != hidden web. The hidden web is simply the stuff the search engines don't find. Systems that surface information about their content only through OAI/PMH interfaces might make up a small part of the hidden web because they're not being surfaced to the bots, but frankly the hidden web holds way more stuff than what's in Fedora and DSpace at universities. Just ask Wikipedia.

The assertion that repository content == the hidden web is circular and false rhetoric that obscures the real problem: people are fighting the web instead of working with it. If you fight it, it will ignore you. This sort of thinking also makes hay for enterprises like the Internet Search Environment Number that seem to me to be trying to carve out business models that exploit, perpetuate and promote the cloistering of content and the rationing of information discovery.

Yesterday, Peter Millington posted what's effectively the antidote on the JISC-REPOSITORIES list (cross-posted to other lists). I reproduce it here in full because it's good advice not just for repositories but for anybody who is putting complex collections of content on the web and wants that content to be discoverable and useful:

Ways to snatch defeat from the jaws of victory
Peter Millington
SHERPA Technical Development Officer
University of Nottingham

You may have set up your repository and filled it with interesting papers, but it is still possible to screw things up technically so that search engines and harvesters cannot index your material. Here are seven common gotchas spotted by SHERPA:
  1. Require all visitors to have a username and password
  2. Do not have a 'Browse' interface with hyperlinks between pages
  3. Set a 'robots.txt' file and/or use 'robots' meta tags in HTML headers that prevent search engine crawling
  4. Restrict access to embargoed and/or other (selected) full texts
  5. Accept poor quality or restrictive PDF files
  6. Hide your OAI Base URL
  7. Have awkward URLs
Full explanations and some solutions are given at: http://www.sherpa.ac.uk/documents/ways-to-screw-up.html

If you know of any other ways in which things may go awry, please contact us and we will consider adding them to the list.

I'm happy to say: Pleiades gets a clean bill of health if we count nos. 5 and 6 as non-applicable (since we're not a repository per se and we don't have a compelling use case for OAI/PMH or PDF).
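
Gotcha no. 3 is easy to check without touching a live site. Here is a minimal sketch using Python's standard-library `urllib.robotparser`; the example.org URL and the two robots.txt bodies are hypothetical, and this is not part of the Pleiades or SHERPA tooling.

```python
from urllib.robotparser import RobotFileParser

# Two hypothetical robots.txt bodies: one shuts every crawler out,
# the other (an empty Disallow) lets every crawler in.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
OPEN = "User-agent: *\nDisallow:\n"


def crawl_allowed(robots_txt, url, agent="*"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    page = "http://example.org/places/1"
    print(crawl_allowed(OPEN, page))
    print(crawl_allowed(BLOCK_ALL, page))
```

To test a real site, paste the contents of its robots.txt into the first argument and pass the URL of a page you expect the search engines to index.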

Disclaimer: we are exploring the use of OAI/ORE through our Concordia project. One of the things we like most about it is that its primary serialization format is Atom, which is already indexed by the big search engines. With the web.

Can you believe it?

So, how long was it before the Mulder/Hades/Orpheus nexus dropped on your head like an anvil from Zeus? It was the dog that put me over the edge.

Outfox Shoutout

I wanted to blog about this as soon as it hit my feedreader, but then there was that proposal to finish. Anyway:

One of the highlights of a decade spent at Carolina was getting to work with Gary Bishop, a professor in the Department of Computer Science. We found ourselves in a collaboration initiated by Jason Morris, a blind Classics graduate student who was deeply interested in ancient geography and for whom Braille maps constituted a ridiculously low-bandwidth, low-resolution disappointment. The idea of producing immersive spatial audio maps took off in the hands of a group of Gary's undergraduate students and, with some seed money from Microsoft Research, this one initiative blossomed into a research and teaching program in assistive technology.

Gary's recently blogged about a cool new project: the Outfox extension for Firefox, which:

allows in-page JavaScript to access local platform services and devices such as text-to-speech synthesis, sound playback and game controllers

It's open source (BSD License), and you can help.

BAtlas IDs: 4 more sets in Asia Minor, plus Cyprus

README file for Barrington Atlas Identifiers, version published 2007-07-26
Reference URL: http://atlantides.org/batlas

Background: http://horothesia.blogspot.com/search/label/batlasids
New maps covered in this release: 62, 63, 65, 72, 86
List of all maps presently covered: 22, 35, 36, 37, 38, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 65, 72, 73, 74, 75, 76, 77, 78, 79, 80, 86

Major classes of change from prior versions are listed below. Consult the individual files named like map22-diff.txt for diffs between the prior version and this version.

* No changes to previously issued files in this release

Feed me, Seymour

It will come as no surprise to the legions of loyal readers here that I'm giving a hearty +1 to David Meadows' call for more antiquity-oriented websites to highlight and alert us to changes by incorporating web feeds.