28 March 2007


ISI Web of Knowledge consortium day in the Netherlands

Thomson Scientific paid a long-overdue visit to their Dutch library customers today. In quite a posh hotel in Utrecht they had organized a whole-day meeting in cooperation with the consortium of Dutch University Libraries (UKB). A whole day! Was that really needed? Well, for somebody who uses WoS on a nearly daily basis, it was a little bit over the top. However, for some attendees from colleges of higher education it was quite a good grounding in the basics of WoS.
Along the way we picked up some interesting bits and pieces as well, such as the fact that the coverage of all three ISI databases together has increased from 8700 to 9200 scholarly journals. Despite the extensive explanation of journal selection by editorial committees, it was admitted that the number of French and Spanish language journals increased under pressure from library consortia in those geographical areas. Let's assume that they included only the top journals in those languages.
The unique author identification aids, which have been available since the end of last year, have now finally moved beyond the authors listed in ISIhighlycited only. Apparently they are now available for some 180,000 unique authors who have each collected more than 1000 citations. For the science commons we have to wait a little longer.
We also heard that the term "correcting for self citations" in the fairly recent (and impressive) citation reports is in fact a little misleading, since it only corrects for the citations from the journals in the original search results. I should have worked that out for myself before, but it reminds me of the discussions we had with Elsevier on Scopus in Utrecht last year.
It was a pity that the basic grounding in general searches took so long that we hardly covered cited reference searches. They passed the screen a few times though. Interesting to note that the marketers from ISI still talked about the citation look-up results, and then always wanted to look up the full search of citing references. I mentioned to them that many researchers are actually only interested in the citation look-up results and want a simple and direct export function from those results into, for instance, Excel. I was under the impression that my arguments didn't really make a big impression. So we will continue for some time with our users complaining about the difficulty of downloading the cited reference search look-up results into some other software (Excel preferably).
Another nice one, well hidden in the depths of WoS, is the possibility of RSS feeds in addition to the e-mail alerts. I was familiar with the latter, but at a certain point my vanity searches expired and I didn’t bother to extend them anymore. The nice thing about the RSS alerts is that they don't expire. You need to register with ISI though. I really wonder how many users have profiles on ISI; ISI couldn't tell. I wonder if that is included in the usage reports. Something to check at a later date.
In the afternoon there was some more attention for EndNote Web. What really amazed me is that no EndNote X account was required, which somehow was the impression that I had from all their advertisements. My neighbour was under that impression as well, so I wasn't the only one. But apparently we can make an EndNote Web account because the university has a WoS license. Interesting to hear from the audience all kinds of little problems that I experienced myself as well: toolbar configuration problems, and problems logging in to some external databases (which was later confirmed as an existing bug). Well, personally I can't get really serious about EndNote Web, but it is perhaps useful for beginning users. I will grill it more thoroughly in the future though.
The best was saved for the end. We got a look at some mock-ups for the major overhaul of Web of Knowledge that is planned for July this year. The colours are army green and soft yellow. The main pages focus on cross-search and simplified boxes on the first screens. The refine search options will move from the top to the left, and some refine options are shown more clearly (more like Scopus?) but still offer more options to refine than Scopus does at this moment. The busy menu that appears on the right-hand side of the screen is either much quieter or disappears. I can't remember exactly anymore. Cited reference searches are still similar to what they are at the moment. They are not going to improve their indexing, and they are not going to correct citations when the mistakes are obvious. In a few years' time you will have to remember that author names were once only 15 characters long, then 18, then included diacriticals and spaces, and at some point in the future will include first names on some occasions as well. I really wonder whether, if you change indexing policies, you shouldn’t also try to correct the repercussions of that change for your historical data as much as possible.
And in the end they will still carry the brand ISI, and the databases SCI, SSCI and A&H but that is for historical reasons only.

Update: Ecobibl was there too and has posted a report.

02 March 2007


Open letter to David M. Leslie Jr. and Meredith J. Hamilton

Dear David and Meredith

I just read with interest your article on standardized citation styles in Serials Review. I couldn't agree more with your article. You addressed the issue from the perspective of the time spent on writing and correcting reference lists in and for journal articles.
Another compelling argument, however, is the impact that is missed because of erroneous citation scanning by institutes like Thomson Scientific (ISI) when they capture references for their Web of Science. Recently Elsevier has had to do a similar job for their Scopus database. The scanning programs do a fair job, but errors do occur. We all know this from looking at the cited reference search result lists in the Web of Science. These errors are partly caused by the many different instructions to authors stipulated by the thousands of scholarly journals out there. Errors in citation data mean missed impact, which means reduced chances of promotion, scholarships, etc.
The entry of Elsevier in the arena of citation data is therefore interesting. On the one hand they have to recognize all the different reference styles because they publish electronic journals and want to link out to the full text wherever possible; on the other hand they want to capture citation data for their Scopus database. As a publisher of some 1800 different titles, with probably about 1800 different instructions to authors, they are the most influential party to take steps on your idea of standardizing these rules.
I have another interest in this matter: as subject librarians we train students and staff to use EndNote. EndNote X comes with some 2,500 different journal styles, whereas we as a library subscribe to some 10,000 different titles. Chances are small that an EndNote style is already available for a specific journal. Of course you can compose your own styles. We do that quite often. But it is a frustrating experience. Instructions to authors often diverge from the actual reference list in the journal, and they are often incomplete. And indeed, they don't match the modern metadata standards.

Yours sincerely
Wouter Gerritsma

PS, I will post this on my blog (http://www.wowter.nl/blog) so Elsevier can read it as well.

David M. Leslie Jr. and Meredith J. Hamilton, (2007). A Plea for a Common Citation Format in Scientific Serials, Serials Review, 33(1): 1-3.
http://dx.doi.org/10.1016/j.serrev.2006.11.009 (Subscription required)

On a humorous note: Serials Review is an Elsevier imprint.

24 February 2007


Sometimes lists can drive you crazy

Currently I am working on a citation analysis job, reviewing quite a number of researchers and an even larger list of publications. Our trick is that we compare the citation data extracted from Web of Science with the baselines found in the Essential Science Indicators (ESI). Both are databases from Thomson Scientific. Not too much work, you might think. Two databases, one derived from the other, made by the same company.
Well, in theory no sweat.
When you try to work out one or two articles you can already run into some little annoyances; when you want to look up thousands of journals ISI can drive you mad.
Say you have established that researcher x has published an article in the American Heart Journal and found y citations. The next step is to look up this journal in ESI. You have to establish in which field the journal is categorized according to ESI. In ESI you have to look this up using the journal abbreviation. Quite simple: the abbreviation of this journal is AMER Heart J. Slightly odd, since this journal is abbreviated in the Journal Citation Reports as AM Heart J. But another article in the American Journal of Critical Care should be abbreviated as AMER j crit care in ESI. Something similar happens with Advances in Atmospheric Science and Advances in Ecological Research. In the first instance you should abbreviate Advances as Adv and in the second instance as Advan. These are merely two examples; doing this manually you run into hundreds of them.
OK, be smart, don't do it manually. Let's automate. At In-Cites there is a list with all journal categories available. Really nice of Thomson to list a really handy help tool outside the product itself. (Yes, there is a help file with journal abbreviations available in ESI, but you can't search that list directly, you have to browse, and heck, they miss the journal categories in that help file altogether.)
Working with the list at In-Cites isn’t a real joy either. Have a look, for instance, at Abacus: that journal is listed twice in the In-Cites list. Not too much of a problem, you might think. But when you want to use a database to make lookups of journal categories and baseline data a bit less labour intensive, the best way is to use the ISSN to couple the various tables.
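Double listings like the Abacus one are easy to flag before any tables are coupled. A minimal Python sketch, assuming the In-Cites list has been saved as rows of (journal title, category); the function name and the sample rows are made up for illustration, and the categories are placeholders rather than real ESI fields:

```python
from collections import Counter

def duplicated_titles(rows):
    """Return the titles that occur more than once, compared case-insensitively."""
    counts = Counter(title.strip().upper() for title, _category in rows)
    return sorted(title for title, n in counts.items() if n > 1)

# Made-up rows in the shape (journal title, category); categories are placeholders.
rows = [
    ("ABACUS", "Category A"),
    ("Abacus ", "Category A"),
    ("SCIENTOMETRICS", "Category B"),
]
print(duplicated_titles(rows))
```

Whether a flagged title is a genuine duplicate or two different journals with the same name still has to be checked by hand.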
Sounds simple. Use the table with all journal categories from In-Cites and match that on the full title against the Journal Masterlist of ISI, where they have the ISSN listed as well. Soon you find out that the AUTRALIAN JOURNAL OF GRAPE AND WINE RESEARCH from In-Cites doesn't match with the AUSTRALIAN JOURNAL OF GRAPE AND WINE RESEARCH from the Masterlist because of a stupid spelling error. Or the A N Z JOURNAL OF SURGERY doesn't match with the ANZ JOURNAL OF SURGERY. From the 12485 journals listed at In-Cites I was only able to match 8346 journals on journal name. That leaves me some 4000 to match manually, or to find out what went wrong.
What I really wonder is how all these little name variations, journal abbreviation differences and other mismatches are possible in a suite of products from a company that breathes databases. A company that has only data in its veins, that sweats information. A company that claims knowledge.
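Part of the remaining mismatches can probably be caught by normalizing the titles before matching and by proposing near misses for manual review. The sketch below is one way to do that in Python, not the procedure I actually used; the function names are hypothetical, the ISSNs are placeholders, and the 0.97 similarity cutoff is an arbitrary assumption that would need tuning:

```python
import difflib
import re

def normalize_title(title):
    """Uppercase, drop punctuation, and glue runs of single letters (A N Z -> ANZ)."""
    words = re.sub(r"[^A-Z0-9 ]", " ", title.upper()).split()
    merged, run = [], ""
    for word in words:
        if len(word) == 1 and word.isalpha():
            run += word          # collect spaced-out initials
        else:
            if run:
                merged.append(run)
                run = ""
            merged.append(word)
    if run:
        merged.append(run)
    return " ".join(merged)

def match_titles(category_titles, masterlist, cutoff=0.97):
    """masterlist maps full title -> ISSN.

    Returns (exact, suggested, unmatched): exact matches after normalization,
    near misses to be checked by hand, and titles with no candidate at all.
    """
    by_key = {normalize_title(t): (t, issn) for t, issn in masterlist.items()}
    exact, suggested, unmatched = {}, {}, []
    for title in category_titles:
        key = normalize_title(title)
        if key in by_key:
            exact[title] = by_key[key][1]
            continue
        close = difflib.get_close_matches(key, list(by_key), n=1, cutoff=cutoff)
        if close:
            suggested[title] = by_key[close[0]]
        else:
            unmatched.append(title)
    return exact, suggested, unmatched

# Placeholder ISSNs, not the real ones.
masterlist = {
    "AUSTRALIAN JOURNAL OF GRAPE AND WINE RESEARCH": "0000-0001",
    "ANZ JOURNAL OF SURGERY": "0000-0002",
}
titles = [
    "AUTRALIAN JOURNAL OF GRAPE AND WINE RESEARCH",
    "A N Z JOURNAL OF SURGERY",
    "ABACUS",
]
exact, suggested, unmatched = match_titles(titles, masterlist)
```

The near misses are kept apart from the exact matches on purpose: two different journals can have titles that differ by a single character, so a fuzzy hit is only a suggestion.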
We all rely heavily on their products.

17 October 2006


You know it is there, but you can’t search for it

Praise the Web! An avalanche of databases has become available, and above all user-friendly forms to search these databases. Commercial database providers have adopted this trend, and nowadays I can’t imagine professional databases that are not available through one of these ubiquitous web forms.
But every now and then I long for a command prompt on some of these databases. It happened to me today. I was inspired by an article from Adkins and Budd (2006) to have a look at the scholarly productivity of Dutch LIS faculty. I was once quoted as saying “I am well and truly under the impression that there is no Dutch Academic LIS research environment whatsoever. It sounds a bit harsh, but so be it”. I should perhaps have made an exception for the researchers at CWTS and Loet Leydesdorff from UVA, but they are more IS than L, and that was and is my perspective when I made this comment.
So I wanted to have a look at the output from Dutch LIS departments. I know there is a group on book and information science at the UVA, but apart from that this whole branch of science was and is a mystery to me, even though I think so much about bridges between science and practice.
How to gain insight into the best of Dutch LIS output quickly? Web of Science of course. Darlin or LISTA were not really of any help. I was aware that you could download the subject categories from the Journal Citation Reports from WoS, but until today I never searched for these in WoS. This is perhaps not a routine search, but it should be possible. It should, shouldn’t it? Well, rather unfortunately, subject categories can’t be searched in WoS. In JCR, the analytical counterpart of WoS, there is the category of “Information Science & Library Science”. Some 55 journals are listed in this category (not all of them appear scholarly to me, i.e. having peer review). But still the annoying thing is that you can’t search for this category in WoS, and can only download this information after you have found the records some other way. Odd, isn’t it?
So what to do? Quite simple. Download this journal table from JCR; this table comes standard with abbreviated journal titles. Look up the exact full titles, and include these journal titles in a WoS search combined with the Netherlands in the address field.
In Dialog, the command prompt world, you could simply have searched the SC= field. To slap you in the face, when you download those LIS publications the SC tag is there, containing “Information Science & Library Science”. But it is not searchable using one of these user-friendly search forms in WoS. Not even using the field tags in the advanced search box, where you can construct quite complicated searches. Isn’t it a shame that the information is there but you can’t search for it because the designers of the query form did not think about it? I am afraid that WoS is not the only example. And that is a real pity.
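Typing some 55 journal titles into a search box is tedious, so the query strings can be generated. A minimal Python sketch, assuming the advanced search accepts the SO= (source title) and AD= (address) field tags and quoted phrases; the function name, the chunk size of 25 titles per query and the two sample titles are illustrative assumptions only:

```python
def build_wos_queries(journal_titles, address_term="Netherlands", chunk_size=25):
    """Build one advanced-search string per chunk of journal titles."""
    queries = []
    for start in range(0, len(journal_titles), chunk_size):
        chunk = journal_titles[start:start + chunk_size]
        # Quote each title so it is searched as a phrase, then OR them together.
        sources = " OR ".join(f'"{title}"' for title in chunk)
        queries.append(f"SO=({sources}) AND AD=({address_term})")
    return queries

# Two sample titles; the real input would be the full titles looked up from the JCR table.
for query in build_wos_queries(["Scientometrics", "Journal of Documentation"]):
    print(query)
```

The result sets of the separate queries then still have to be combined in the search history.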
When the googlification of databases through simplified search forms means that you can’t search for the properly organized wealth of information within these databases we have a serious problem.

Adkins, D. & J. Budd (2006). Scholarly productivity of U.S. LIS faculty. Library & Information Science Research 28(3): 374-389. http://dx.doi.org/10.1016/j.lisr.2006.03.021.

