04 November 2007
This is my last post
After a long period of hesitation, I have finally started a separate blog in English. The rationale for a separate blog in English is described in the first post at my new blog.
Since I am the least geeky biblioblogger around, I have not made the effort to transfer all English posts from this blog to the new blog and install proper redirects. No. My previous posts in English will remain on this blog. Thus my brand new English blog will be starting from scratch and will need some time to grow and gain some impact.
So, my dear readers, check your bearings: for information in English on library 2.0 developments and library innovation in the Netherlands, and for reviews of databases or scientometrics, go over to wowter.net and subscribe to the new feed.
30 August 2007
John MacColl's presentation
29 August 2007
Open Source Software and XML Workshop by Eric Lease Morgan
Eric Lease Morgan actually asked some of the participants about their objectives in attending his workshop. What I just described was also what I answered to his question. Perhaps another objective was that I could use some of the stuff I learned today and apply it to my own little bibliographies, or perhaps even my websites.
The workshop started with some OSS evangelism. While he was spreading the word, I really wondered what kind of OSS tools our guys and gals were using in our systems. I really don’t know, even though we have a completely independent, in-house developed LCMS. Have to find out though.
That Eric is serious about OSS is clear from the fact that all the material he used for this workshop is freely available at http://www.tilburguniversity.nl/services/lis/ticer/07carte/publicat/oss-and-xml.zip. It includes manuals, software and exercises. So if you are really interested you can go ahead. I think if you search around a little, the same stuff can be found at other places as well (Lockss we call that). What you don’t get, though, when you do it yourself is Eric’s humorous and enthusiastic way of presenting seemingly complicated matters. He is a gifted teacher.
Further on in the morning we did little exercises on writing and reading MARC records, extracting them from the Library of Congress, building a database of MARC records, indexing it and searching the database. An interesting assumption on his part was that most library catalogs are based on MARC records. That might be the case in the USA, but is not necessarily true in Europe. But this did not really matter for his exercises or their purpose.
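For what it's worth, the "index it and search it" step can be sketched in a few lines of Python. This is not Eric's workshop material: the records below are made-up dictionaries keyed by MARC-style tags (001 for the record id, 100 for the author, 245 for the title) rather than real MARC files, and `build_index` and `search` are hypothetical helper names.

```python
import re
from collections import defaultdict

# Hypothetical sample records, keyed by MARC-style tags (not real catalog data).
RECORDS = [
    {"001": "rec1", "100": "Example, Alice", "245": "Open source software for libraries"},
    {"001": "rec2", "100": "Sample, Bob", "245": "XML for libraries"},
    {"001": "rec3", "100": "Example, Alice", "245": "Indexing and searching"},
]


def tokenize(text):
    """Split text into lower-case words made of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records, fields=("100", "245")):
    """Build an inverted index: word -> set of record ids containing it."""
    index = defaultdict(set)
    for record in records:
        for tag in fields:
            for word in tokenize(record.get(tag, "")):
                index[word].add(record["001"])
    return index


def search(index, query):
    """Return the ids of records that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    hits = [index.get(word, set()) for word in words]
    return set.intersection(*hits)


index = build_index(RECORDS)
print(sorted(search(index, "libraries")))
print(sorted(search(index, "open libraries")))
```

The inverted index is the simplest thing that works: every query word is looked up directly, and the result sets are intersected, so a record has to contain all the words to be returned.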
In the afternoon, we got around to XML. It really covered the mere basics. What I found interesting were some of the exercises where we actually transformed and presented the same texts (files) with different XSL or CSS. I knew these things, but so far had never actually done these little things myself. It was a bit of getting your hands dirty yourself. Some exercises were run from the command line, which led to those annoying, stupid little mistakes, reminding me of my days programming in Fortran.
All in all, an interesting day. We could have gone a bit deeper into the details and I would have loved a little instruction on Perl as well. There are only a limited number of hours in a day, though.
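The idea behind those exercises, one XML file and several presentations, can be imitated in plain Python. To be clear about the assumptions: this is not XSLT or CSS, and not the workshop's own exercise. Two small functions merely play the role of two stylesheets, and the sample document is invented for the illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical sample document; not taken from the workshop material.
SAMPLE = """<letter>
  <to>Reader</to>
  <body>The same file can be shown in more than one way.</body>
</letter>"""


def as_plain_text(xml_text):
    """First 'stylesheet': one 'tag: text' line per child element."""
    root = ET.fromstring(xml_text)
    return "\n".join(f"{child.tag}: {(child.text or '').strip()}" for child in root)


def as_html(xml_text):
    """Second 'stylesheet': the same content as an HTML bullet list."""
    root = ET.fromstring(xml_text)
    items = "".join(
        f"<li><b>{escape(child.tag)}</b> {escape((child.text or '').strip())}</li>"
        for child in root
    )
    return f"<ul>{items}</ul>"


print(as_plain_text(SAMPLE))
print(as_html(SAMPLE))
```

The content lives in one place and each renderer decides only how it looks, which is the separation that XSL and CSS give you for real.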
Another teacher who understands it
28 August 2007
A well packed day at Ticer
The first presentation was a sales pitch from OCLC. Well in the Netherlands we are always glad to receive some information from OCLC (Ohio) since we don’t receive that much information from OCLC Pica on their moves, strategies or plans. Should we therefore attend an expensive course to happily receive this sales pitch?
Robin Murray started by selling himself first, followed by outlining what OCLC actually is and does. His sales pitch was “synthesize, specialize, mobilize”. It is actually a well-founded pitch; his whole story can be read in Ariadne.
To be honest he had some interesting observations and plans. But as was remarked later in the discussions, OCLC excels in plans, reports, and visions, but the actual products are lacking. Perhaps that is a little too harsh, since I really do like what they have achieved with Open WorldCat. But at the local or group level (considering the Dutch libraries as a group) there is a lot of misunderstanding as to how Open WorldCat relates to the (expensive) service of NCC or Picarta. But I might be too stupid to see through all these things. The lecture had some good ideas, but the sales pitch was my lasting impression, rather than a round-up of the direction in which the library world is, or should be, moving.
The second talk of today was a really nice summing up of the developments around the catalog since NCSU launched their Endeca powered catalog. Peter Binkley organized his overview of future enhancements of the library catalogue around the themes of clustering, ranking, exploiting, contributing and deploying. His presentation was very up to date. Peter’s preferred choice among the new open source library systems was VUFind. Reminds me of Koha, which he didn’t cover.
In the afternoon we had a really interesting presentation on the use of chatbots in German libraries. Anne Christensen covered 4 different chatbots in operation at German libraries. The idea is appealing. There are some serious costs involved, but they actually got used. It brings some fun to a library website, and that should be worth some money.
The last presentation was a tough one. A boring, albeit important, subject such as identity management, spun out over an hour at the end of the day: that was really testing us. And then it went into some detailed technical level, which was a bit over the top, too much for me personally. It would have been wonderful if people from our university IT service could have been present at this lecture as well. A scheme like this should be endorsed by the library and the IT department, and I am under the impression that there is a little disagreement on some points in this area at my home university. Interesting to hear about the developments in this field, but really and totally beyond my interest.
27 August 2007
The Science Commons and the Library: Opportunities and Business Models
Today’s talk was different though. John Wilbanks pleaded strongly for a major library role in the changing face of scientific communication. That is encouraging. However, as to the exact role that libraries should take, there isn’t a single blueprint yet. The Science Commons has some interesting examples of text mining initiatives on medically oriented databases: bibliographic databases in combination with protein databases and genetic databases. I think it is an illustrative example of technologies and expertise that most academic libraries do not have easily or readily available.
I think I can foresee technologies like this in the mid term future, but most libraries are not ready for the roles outlined by Wilbanks.
Should we just sit and wait for the blueprints for these future applications to arrive? No, of course not. Wilbanks is also an OA advocate and presses the librarians to go out into the faculties and educate the researchers about Open Access and copyright issues. Because at this moment copyright laws are hampering the kind of big science that e-science really stands for. In addition to his recommended reading I would like to add his paper in CTWatch Quarterly.
Wilbanks, J. (2007). Cyberinfrastructure for Knowledge Sharing. CTWatch Quarterly 3(3): 58-66. http://www.ctwatch.org/quarterly/articles/2007/08/cyberinfrastructure-for-knowledge-sharing/
Ticer, first impressions
Tomorrow lots more.
A tip to fellow bloggers: use the Ticer or Ticer07 tag on your posts. I am collecting them on del.icio.us so everybody can have their read. I admit at once that there are a large number of Dutch posts. Ecobibl has been most active up until now (in Dutch).
02 May 2007
Quite a lot has happened to this blog over the last few months. These figures are almost required reading for all SEMs and SEOs: imagine this happening to one of your paying clients. In February the visitor numbers started to decline. Decline? They halved. No, they fell to a third of the numbers that had been usual until then! The decline began precisely on 8 February. March was even worse. Recovery only set in after the live-blogging during the CWIS days. What surprises me most is the behaviour of Google. Normally it is a reliable driver of traffic, but in the period from 8/2 to 18/3 it delivered only a third of the normal number of visitors. Somehow the drop in visitors coming from Google seems to be correlated with the other visitors, or with the number of posts written. I can't really get that clear. I know you should never make yourself dependent on a single channel for visitors, which is why I am also fond of my RSS readers (now past 700!), the visitors who arrive via the tags, and above all the many referrals from other blogs.
Still, I remain puzzled by that dip in February-March. The recovery in visitor numbers does seem to be under way, though.
01 May 2007
Most of the time Gmail does a fair job of recognizing what's spam and what's not. However, today it seems a bit harsh on its own brethren.
26 April 2007
Balanced Libraries, the book has arrived
According to the National catalogue (NCC) we are the second library in the Netherlands to add this book to its collection. The other one is Groningen, although it doesn't show up in their own catalog yet.
Our catalog record for the book can be found here. Via SFX the catalog record links through to discussion pages at Walt at Random and the linked list of referenced blogposts.
Luckily we have a short holiday next week. You may guess who is borrowing this book currently.
19 April 2007
Food for thought
18 April 2007
CIL2007 from a distance
16 April 2007
What is the value of a Web 1.0 company?
In the defining Web 2.0 post by Tim O'Reilly, "What is Web 2.0", the contrast between DoubleClick and Google AdSense is listed first, as perhaps the most exemplary pair of Web 1.0 and Web 2.0 undertakings.
Since Google acquired DoubleClick last Friday, we should realize that there is still a lot of value in Web 1.0.
28 March 2007
ISI Web of Knowledge consortium day in the Netherlands
Along the way we picked up some interesting bits and pieces as well. Such as the fact that the coverage of journals by all three ISI databases has increased from 8700 scholarly journals to 9200 journals. Despite the extensive explanation of journal selection by editorial committees it was admitted that the number of French and Spanish language journals increased under pressure from library consortia in those geographical areas. Let's assume that they included only the top journals in those languages.
The unique author identification aid, which has been available since the end of last year, has now finally moved beyond the authors listed in ISIhighlycited only. Apparently this is now available for some 180,000 unique authors who have each collected more than 1000 citations. For the science commons we have to wait a little longer.
We also heard that the term "correcting for self citations" in the fairly recent (and impressive) citation reports is in fact a little misleading, since it only corrects for the citations from the journals in the original search results. I should have worked that out for myself before, but it reminds me of the discussions we had with Elsevier on Scopus in Utrecht last year.
It was a pity that the basic grounder on general searches took so long that we hardly covered cited reference searches. It passed the screen a few times though. Interesting to note that the marketers from ISI still talked about the citation look-up results, and then always wanted to look up the full search of citing references. I mentioned to them that many researchers are actually only interested in the citation look-up results and want to have a simple and direct export function from those results into, for instance, Excel. I was under the impression that my arguments didn't really make a big impression. So we will continue for some time with our users complaining about the difficulty of downloading the cited reference search look-up results into some other software (Excel preferably).
Another nice one, well hidden in the depths of WoS, is the possibility of RSS feeds in addition to the e-mail alerts. I was familiar with the latter. But at a certain point my vanity searches expired and I didn’t bother to extend them anymore. The nice thing about the RSS alerts is that they don't expire. You need to register with ISI though, and I really wonder how many users have profiles on ISI. ISI couldn't tell. I wonder if that is included in the usage reports. Something to check at a later date.
In the afternoon there was some more attention for EndNoteWeb. What really amazed me is that no EndNoteX account was required, which somehow was the impression that I had from all their advertisements. My neighbour was under that impression as well, so I wasn't the only one. But apparently we can make an EndNote Web account because the university has a WoS license. Interesting to hear from the audience all kinds of little problems that I experienced myself as well: toolbar configuration problems, and logging in to some external databases (which was later confirmed as an existing bug). Well, personally I can't get really serious about EndNote Web, but it is perhaps useful for beginning users. I will grill it more thoroughly in the future though.
The best was saved for the end. We got a look at some mock-ups for the major overhaul of Web of Knowledge that is planned for July this year. The colours are army green and soft yellow. The main pages focus on cross-search, with simplified boxes on the first screens. The refine search options will move from the top to the left, and some refine options are shown more clearly (more like Scopus?) but still offer more options to refine than Scopus does at this moment. The busy menu that appears on the right hand side of the screen is either much quieter or disappears. Can't remember exactly anymore. Cited reference searches are still similar to what they are at the moment. They are not going to improve their indexing, and they are not going to correct citations even when the mistakes are obvious. In a few years' time you will have to remember that author names were once only 15 characters long, then 18, then included diacriticals and spaces, and at some point in the future will include first names on some occasions as well. I really wonder whether, if you change indexing policies, you shouldn’t also try to correct the repercussions of this change for your historical data as much as possible.
And in the end they will still carry the brand ISI, and the databases SCI, SSCI and A&H but that is for historical reasons only.
Update: Ecobibl was there too and has posted a report.
22 March 2007
Google Sandbox Effect
When I started a new blog about the moving of our Library, at the request of the library, I asked for a domain on the library website. Starting a blog is all right, but some impact is to be desired. So a http://library.wur.nl/ domain would be desirable.
However, installing WordPress on the library server was too much of a hassle.
Posting to an FTP site within the firewall was out of the question, so a redirect was proposed. And how useful that is. Look at the SERP for the simple query [forumgebouw verhuizing]. The first hit in Google is found at our library website, whereas the real blog website is only to be found at rank 34 or so.
Just another example that the Google Sandbox is for real. So the bottom line, when you want to use a blog for promotional activities: start early, or find other means to avoid the Google Sandbox.
08 March 2007
The future of bibliographic control
They took my blogpost seriously, and their next meeting is tomorrow, or today -by the time you are reading it- in Mountain View, California. It is open to anybody. Just hop on the bus, and attend. If only the Concorde would still fly, I could make it.
I can't get a real impression of the focus of the meeting from the program or the background paper. There is a lot to be said on those various issues.
Really interesting is the passionate plea of Karen G. Schneider on the ALATechsourceblog about her feelings on this meeting. It shouldn't be about control; bibliographic liberation is a term I like much better. It is about how to navigate this sea, ocean or tsunami of information much better. It is a very worthwhile post. She gives some ideas to ponder a little further.
07 March 2007
Peter, the previous Director of Digital Library Technologies of CDL and current Director of the Digital Library Federation, shares some of his doubts about the deals that have been struck so far. And the way it has all been established.
Hattip: Lorcan Dempsey
Flickr plz help Gerard
We wonder in the Netherlands if there is an efficient Yahoo! or Flickr webcare team in action. Please get in contact with Weblogzonderhaast
Update (March, 8th): It helped. Have a look at Weblogzonderhaast
02 March 2007
Open letter to David M. Leslie Jr. and Meredith J. Hamilton
I just read with interest your article on standardized citation styles in Serials Review. I couldn't agree more with your article. You addressed the issue from the point of view of the time spent on writing and correcting reference lists in and for journal articles.
Another compelling argument, however, is the impact missed because of erroneous citation scanning by institutes like Thomson Scientific (ISI) when they capture references for their Web of Science. Recently Elsevier has had to do a similar job for its Scopus database. The scanning programs do a fair job, but errors do occur. We all know this from looking at the cited reference search results lists in the Web of Science. These errors are partly caused by the many different instructions to authors stipulated by the thousands of scholarly journals out there. Errors in citation data mean missed impact, and reduced chances of promotion or scholarship etc....
The entry of Elsevier in the arena of citation data is therefore interesting. On the one hand they have to recognize all the different reference styles because they publish electronic journals and want to link out to the full text wherever possible; on the other hand they want to capture citation data for their Scopus database. As a publisher of some 1800 different titles, with probably about 1800 different instructions to authors, they are the most influential party to take steps on your idea of standardizing these rules.
I have another interest in this matter: as subject librarians we train students and staff to use EndNote. EndNote X comes with some 2,500 different journal styles, whereas we as a library subscribe to some 10,000 different titles. Chances are small that an EndNote style is already available for a specific journal. Of course you can compose your own styles. We do that quite often. But it is a frustrating experience. Instructions to authors often diverge from the actual reference list in the journal, and they are often incomplete. And indeed, they don't match the modern metadata standards.
PS, I will post this on my blog (http://www.wowter.nl/blog) so Elsevier can read it as well.
David M. Leslie Jr. and Meredith J. Hamilton, (2007). A Plea for a Common Citation Format in Scientific Serials, Serials Review, 33(1): 1-3.
http://dx.doi.org/10.1016/j.serrev.2006.11.009 (Subscription required)
In humour: Serials review is an Elsevier imprint.
01 March 2007
I see a link with updating frequency. I have been too absorbed with my work to get around to blogging regularly. Okay I blog mostly in the evenings, but even then work cropped up. Hopefully next week will be a better month.
27 February 2007
Do we need the evidence for library2.0?
My point is, I see two important movements at this moment in librarianship. Library 2.0 is the one container of ideas and EBL is the other. To my knowledge I have not come across many, or any, discussions about both these issues in the same article or blogpost.
There were two occasions that prompted me to think about these two subjects and their possible relation. When I gave a presentation about library 2.0 some weeks ago, a colleague of mine slammed on the brakes. "It is all very nice, what you presented, but is it working?" is what he asked me. "Well, we should try things out and see what happens", was my feeble response.
There is possibly some evidence, scattered in blogs and wikis. But no one has systematically reviewed the scanty evidence yet. Weblogzonderhaast presented some data for search actions from their library toolbar. Some time ago I presented some data on the personalization functionalities of our library services. There is possibly more data to be found, but you have to look very hard. Crawford commented on some figures for uptake of IM reference, which was about 1-2% (Crawford, 2006).
On the other hand I was introduced to the concept of Evidence based librarianship in a very interesting presentation by Andrew Booth during the 6th performance measurement conference in Durham in 2005. His approach made sense to me. As a researcher, I liked the outlined approach. As an avid reader, I was already in the habit of looking for applications that worked for others, to see if we could apply them in our practice.
It was a post on a Dutch library discussion list, which asked for examples from the Dutch library world that were based on EBL, that actually triggered me to think about the need for more data to support investing more heavily in library 2.0 technologies. The discussion list did not give any clues, so I have to look a bit harder for the data myself.
The trade-off is, of course, that when we all wait for the data to appear first, nobody will start with innovations. So we need to follow our instincts, gut feeling and nerve to charge ahead, but not without gathering data to support our decisions. Even if it is only with hindsight that we can draw some conclusions.
Booth, A. (2005). Counting what counts: the link between Performance Measurement and Evidence Based Information Practice. 6th Northumbria International Conference on Performance Measurement in Libraries and Information Services, Durham. http://northumbria.ac.uk/static/powerpoint/Booth.ppt
Crawford, W. (2006). Finding a balance: Libraries and Librarians. Cites & Insights: Crawford at Large 6(9): 2-19. http://citesandinsights.info/civ6i9.pdf
24 February 2007
Important OA proponent becomes minister of education in the Netherlands
Policy makers at Dutch Universities are thrilled with his appointment. Since he was quite a popular columnist a lot of his opinions are well known. So far, his ideas on OA, and his active participation, and his open rebellion against copyrights of the big publishers, have not been highlighted yet. It is about time to do a review about this subject. It will be really interesting to see how he deals with the OA issue on the political agenda.
Sometimes lists can drive you crazy
Well, in theory no sweat.
When you try to work out one or two articles you can already run into some little annoyances; when you want to look up thousands of journals, ISI can drive you mad.
Once you have established that researcher x has published an article in the American Heart Journal and found y citations, the next step is to look up this journal in ESI. You have to establish in which field the journal is categorized according to ESI. In ESI you have to look this up using the journal abbreviation. Quite simple: the abbreviation of this journal is AMER Heart J. Slightly odd, since this journal is abbreviated in the Journal Citation Report as AM Heart J. But another article, in the American Journal of Critical Care, should be abbreviated as AMER j crit care in ESI. Something similar happens with Advances in Atmospheric Science and Advances in Ecological research. In the first instance you should abbreviate Advances as Adv and in the second instance as Advan. These are merely two examples; doing this manually you run into hundreds of them.
Ok, be smart don't do it manually. Let's automate. At In-Cites there is a list with all journal categories available. Really nice of Thomson to list a really handy help tool outside the product itself (Yes there is a help file with journal abbreviations available in ESI, but you can't search that list directly, you have to browse, and heck they miss the journal categories in that help file altogether)
Working with the list at In-Cites isn’t a real joy either. Have for instance a look at Abacus, that journal is listed twice at the In-Cites list. Not too much of a problem you might think. But when you want to use a database to make lookups of journal categories and baseline data a bit less labour intensive the best way is to use ISSN to couple the various tables.
Sounds simple. Use the table with all journal categories from In-Cites and match that on the full title against the Journal Masterlist of ISI, where they have the ISSN listed as well. Soon you find out that the AUTRALIAN JOURNAL OF GRAPE AND WINE RESEARCH from In-Cites doesn't match the AUSTRALIAN JOURNAL OF GRAPE AND WINE RESEARCH from the Masterlist because of a stupid spelling error. Or the A N Z JOURNAL OF SURGERY doesn't match the ANZ JOURNAL OF SURGERY. Of the 12485 journals listed at In-Cites I was only able to match 8346 journals on journal name. That leaves me some 4000 to match manually, or to find out what went wrong.
What I really wonder is how all these little name variations, differences in journal abbreviations and other mismatches are possible in a suite of products from a company that breathes databases. A company that has only data in its veins, that sweats information. A company that claims knowledge.
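A more forgiving match can be sketched in Python. Treat it as a rough illustration under stated assumptions, not as the lookup actually used here: the function names are hypothetical, the ISSNs are obvious placeholders, and only the two mismatching titles come from the lists discussed above. Titles are normalized first (upper case, punctuation and spaces removed), and whatever still fails is compared with `difflib` from the standard library.

```python
import difflib
import re


def normalize_title(title):
    """Upper-case, turn punctuation into spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^A-Z0-9 ]+", " ", title.upper())
    return " ".join(cleaned.split())


def squash(title):
    """Normalized title without any spaces, so 'A N Z' equals 'ANZ'."""
    return normalize_title(title).replace(" ", "")


def match_titles(category_titles, master_issn_by_title, cutoff=0.9):
    """Match category titles to master-list titles.

    Returns (matched, unmatched); matched maps each category title to a
    tuple (master title, ISSN, 'exact' or 'fuzzy').
    """
    by_key = {squash(title): title for title in master_issn_by_title}
    keys = list(by_key)
    matched, unmatched = {}, []
    for title in category_titles:
        key = squash(title)
        if key in by_key:
            master = by_key[key]
            matched[title] = (master, master_issn_by_title[master], "exact")
            continue
        # Fall back to the closest master key, if it is similar enough.
        close = difflib.get_close_matches(key, keys, n=1, cutoff=cutoff)
        if close:
            master = by_key[close[0]]
            matched[title] = (master, master_issn_by_title[master], "fuzzy")
        else:
            unmatched.append(title)
    return matched, unmatched


# Placeholder ISSNs; the last category title is invented so one lookup fails.
MASTER = {
    "AUSTRALIAN JOURNAL OF GRAPE AND WINE RESEARCH": "0000-0001",
    "ANZ JOURNAL OF SURGERY": "0000-0002",
}
CATEGORIES = [
    "AUTRALIAN JOURNAL OF GRAPE AND WINE RESEARCH",
    "A N Z JOURNAL OF SURGERY",
    "HYPOTHETICAL JOURNAL NOT IN MASTERLIST",
]

matched, unmatched = match_titles(CATEGORIES, MASTER)
for title, (master, issn, how) in matched.items():
    print(f"{how}: {title} -> {master} ({issn})")
print("still unmatched:", unmatched)
```

Fuzzy matches deserve a manual check before the ISSN is trusted, since two genuinely different journals can have nearly identical titles, and removing spaces can in principle make two distinct titles collide.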
We all rely heavily on their products.
23 February 2007
The most interesting post related to this subject appeared today on the Feedburner blog "Burning Questions" itself. They give a fairly comprehensive overview of the most popular web-based feedreaders. myYahoo as measured by clickthroughs seems to be the most popular (but they only syndicate headlines). This is followed by Google, Bloglines and Netvibes. When measured by views, Google Reader, Bloglines, NewsGator and Netvibes, in that order, account for 98% of all views. Google is by far the most popular.
The numbers reported here are a bit in contrast with a fairly recent post by Hitwise. Hitwise measures something altogether different: they measure web traffic. For web-based readers, however, you'd expect some correlation. But on January 18th Bloglines was by far the most popular web-based reader in the USA, according to Hitwise. LeeAnn Prescott writes "Google Reader has grown lately, but as of the week ending 1/13/07, it had only 1/13 of the market share of visits of Bloglines."
I can't believe Google Reader made up this much ground in such a short time period. I myself was wondering about geographical influences. I notice on my feed that Netvibes has become the most popular reader. On some other Dutch blogs Netvibes is quite popular too, but not as popular as on mine. Still, all substantially higher than in the USA. Might this be caused by the fact that it is a French company?
Just a question.
I have tried Netvibes for some time as well, but I think PageFlakes is actually more impressive since it allows you to share your resources more easily.
Interesting to note that Pandia just did a qualitative review of RSS readers. The sentence I liked most was their criticism of Google Reader: "I also often see the Google Labs test tube logo, which is displayed when Google Reader needs some seconds to work on a request." It really drives me mad every now and then. They can search billions of webpages in a fraction of a second and even show my personalized results, yet the same Google can't resolve a few feeds (some 300) in less than a few seconds... Otherwise the choice of Rojo and FeedShow seems a bit far-fetched.
CleverClogs has also an interesting post on this subject.
Update: The RRW write-up includes Pheedo stats for comparison
21 February 2007
Marjolein blogged yesterday that the tool has become available to the general public as well. So I decided to give it a go. Answers.com is one of the better collections of instant reference works on the Web, aggregating explanations from various dictionaries and encyclopedias.
It doesn't work for Dutch words though.
20 February 2007
Hattip: Dutch cowboys
18 February 2007
Finally, Google and Feedburner talk with each other
As an avid watcher of (my) blog statistics I always thought it would be a good idea for Google to cooperate with Feedburner on their statistics program. This weekend it finally became reality. My subscriber number soared past 500, registering an absolute maximum of 530 last Friday.
I know it is all relative.
I am probably the most frequently subscribed reader, with a Bloglines account, Netvibes, Google reader and a Google personal account. But studying the previous Information Professional, I know that Eric S. is using Feedburner and Bloglines to follow this blog. So there will be, and should be, some inflation in these figures. My inquisitive colleagues are experimenting and testing with all kinds of means to read and try RSS readers for their patrons.
But it is interesting to watch this jump in reader base anyway.
On the other hand, as a distant Internet business watcher, it is interesting to see Google approaching a small (but interesting) player in the Web statistics business. Will there be another successful takeover in the making?
14 February 2007
The Directory of Open Access Journals has launched a membership program. Peter Suber strongly supports their membership program. Given the primordial stage they are still in, our library is going to support this request anyway. We are not yet on the list of member libraries, but sooner or later we will be. At least for this year, perhaps the next. Personally I am a bit wary of this initiative.
Consider for a moment seriously what the essence of DOAJ entails. It is a directory of open access journals. None of these journals is hosted at DOAJ. Not at all. In its essence it is a collection of authoritative, high quality links to peer reviewed, scholarly open access journals. A collection of some 2500+ links. And oh yes, they have built a search engine for some of the contents of these journals. Yes, only partly. Around their website there is some information on OA and the OA movement, but that is about it.
What would it cost to maintain this all?
There are other initiatives in this arena as well. Perhaps not as well known but they are around. I have pointed out some of these before already. The most interesting I find Livre, Open J-gate, and Regensburg. These for the directories. On top of that you have many search engines that spider these collections. Scirus is probably the best of the rest (that is after Google, which does it very badly) And Google Scholar, we don't really know.
However, as a library we will support the DOAJ membership initiative.
What will we get in return? DOAJ Membership Benefits
- Acknowledgement as a DOAJ Member on the DOAJ Membership Pages, including link to your/your institutions/your company's homepage.
- Access to the list of recently added titles
- Subscription to e-mail list for DOAJ members
- Access to list of removed titles
- The right to use the DOAJ membership in marketing activities.
A backlink from DOAJ is nice since they have a substantial pagerank (8/10). Yet another mailing list is not what I am waiting for. Lists of newly added or removed titles sound interesting, but in essence we are relying on the SFX knowledgebase. It is not only relying: we subscribe to it, and expect Ex Libris to maintain this knowledgebase properly (which they have done very well to date). Besides, are they a member already, and what would their membership cost? The use of DOAJ in marketing activities is not really what we are waiting for either.
What do we really want?
Give us some feedback on usage statistics. Please!
Most of the journals listed at DOAJ don't have the capacity to provide their various users or user groups with feedback on their personal or group usage. Libraries are more and more often asked for figures and data to back up their decisions and expenditures. We need to justify what we do each and every day. If DOAJ could provide us with usage reports for our institute, we would be much more interested in their membership program. As far as I understand they are not in a position to provide official Counter-compliant reports, since they don't host the journals. But they should be able to provide us with some meaningful data. It is not only to justify the cost-benefit relation from our point of view. It should provide us with data to justify the cause of OA as well. Show our users how much they are actually using these 'free' journal articles in comparison to those from the established publishers.
I think a membership program is in its place when we get some more solid data in return.
08 februari 2007
Avoiding personalized Google results
Is there really no escape from Google's personalized results, so that you can at least compare how Google optimizes your personal search engine results page (SERP)? I think there is a small loophole. Cleaning your cookies should help, but Google is storing some of your search and click behaviour on their sites as well. The other trick is to use a specific data center. When I use Google at http://126.96.36.199/ I do get a slightly different ranking of the SERP than when using my standard http://wwww.google.com/ig or a Google toolbar search for the same keyword.
What I haven't checked yet is whether Google is storing my IP and search actions there as well, and using that "against" me. They probably do. Still, there are many clusters of IP addresses of Google data centers, and it seems improbable to me that Google is exchanging IP and search information between all data centers.
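To make such a comparison less impressionistic, the two result lists can be compared position by position. The following is only a minimal sketch: it assumes you have copied the result URLs of both searches by hand into two ordered lists, and the function name and the example URLs are hypothetical.

```python
def compare_rankings(baseline, personalized):
    """Compare two ordered lists of result URLs.

    Returns the rank shift for every URL present in both lists
    (positive = ranked lower in the personalized list), plus the
    URLs that appear in only one of the two lists.
    """
    base_pos = {url: i + 1 for i, url in enumerate(baseline)}
    pers_pos = {url: i + 1 for i, url in enumerate(personalized)}

    shifts = {url: pers_pos[url] - base_pos[url]
              for url in baseline if url in pers_pos}
    only_baseline = [url for url in baseline if url not in pers_pos]
    only_personalized = [url for url in personalized if url not in base_pos]
    return shifts, only_baseline, only_personalized


# Hypothetical example: results from a data-center IP vs. the standard page.
datacenter = ["example.org/a", "example.org/b", "example.org/c"]
standard = ["example.org/b", "example.org/a", "example.org/d"]

shifts, dropped, added = compare_rankings(datacenter, standard)
print(shifts)   # rank changes for URLs found in both lists
print(dropped)  # URLs only in the data-center results
print(added)    # URLs only in the standard results
```

Repeating this for a handful of keywords would show whether the differences are systematic or just noise.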
01 februari 2007
Managers love these. That applies to the managers of our university as well. They really like the ESI rankings (albeit we're losing some prestige) or those from Newsweek and THES. A new type of ranking is based on webometrics. Link analysis of websites, that is. A group of researchers from Spain has been quite active in this field. They posted a preprint of their analysis of the European academic Web on E-LIS. Interesting reading.
In Europe, the UK and Germany are the two most inter-linked academic Web communities. The Netherlands sits somewhat closer to the UK. The UvA and VU are two of the better linked universities in the Netherlands. Wageningen UR is a midget somewhat distant from the center where the real action takes place. This is perhaps partly due to the older web address the researchers have used in their investigation. But looking closer at their Website Webometrics which is part of their ongoing research, reveals some real problems for the Web-identity of our university.
As the main university website they still list our old domain, but next to that there is Larenstein (perhaps rightfully so). And they have listed a portal, Bioinformatics at Wageningen University, and the Graduate School Experimental Plant Sciences as separate identities as well. Our position as a combined university and research institute is diluted even more by the fact that some of the research institutes are treated as separate, distinct identities as well. To mention a few: Alterra-ILRI, CIDC (listed at two Web addresses), CRC, RIKILT (also listed under two addresses), Wageningen Feed Processing Centre, Wageningen Institute of Animal Sciences (should actually be listed as a school), Wageningen NMR center, Wageningen UR, and ISRIC (also with two addresses).
There is plenty of room for criticism of the Spanish website and their selection of websites of institutes. They also list redirected pages. But our university website(s) don't make matters any easier for these foreign investigators. There is for instance no sitemap available (which would improve spidering of the website by search engines as well). Furthermore there are still too many seemingly independent websites that bear hardly any relation (in their domain) to Wageningen UR. Take for instance WIAS, VLAG or Plantenwetenschappen. They are one hundred percent related to the University, but nothing in the web address (or layout) shows this relationship. There are whole legions of exotic websites such as Syscope, de Natuurkalender or IBL, etc. These websites should be used to improve the web presence of our University by making them an integral part of the WUR domain.
What does it matter?
Well, those Webometricians do their research. Fair enough, but this is not only academic inquisitiveness. These are not mere theoretical exercises. Popular search engines work on exactly the same principles. Our web presence is in dire need of improvement. Look for instance at the traffic of three of our major domains. Wau.nl generates more traffic than Wageningenuniversiteit.nl. And we had a very expensive operation to move everything to a single web domain, with a brand new layout, and it was declared a success. Only when you look at the traffic at the previous link over a somewhat longer period do you get some interesting graphs. Since the change in December 2005, total traffic has plummeted, and the Wageningenuniversiteit.nl site has never really attracted more traffic than the old wau.nl site. It is now more than a year after the whole operation, and all kinds of redirect pages are still afloat and attract a lot of traffic. Improving the visibility and performance of a single WUR domain seems badly needed.
But what really pleases me, though: our library website generates 39% of all WUR traffic. The library at the heart of the organization, that is. WoW!
This is of course a laughing farmer with a very serious toothache.
Ortega, J. L., I. Aguillo, et al. (2007) Maps of the academic web in the European Higher Education Area - an exploration of visual web indicators. E-LIS http://eprints.rclis.org/archive/00005038/
How embarrassing, for such a developed country, the statistics on natural disasters that were published by the UNISDR. The Netherlands ranked 4th on the list of natural disasters by number of deaths.
Apparently we can't keep our elderly and weak people protected from the heat. But even more amazing I found the fact reported in a Dutch newspaper that the data collectors had great difficulty collecting reliable information from the Netherlands.
It has never been so busy on WoW!ter
January was a really busy month on this blog. Google Analytics registered some 4,500 visitors that looked at around 6,800 pages. 3,500 unique visitors! The busiest day was Wednesday the 17th when I announced my plans for a new review of the Dutch biblioblogosphere. A message posted on the Nedbib-L, the most important discussion list for Dutch Librarians, resulted in a true spike of direct visitors.
21% of the visitors came from countries other than the Netherlands or Belgium. Visitors came from 59 different countries. In the beginning of January I noted an unusual peak of visitors from Taiwan. Over the last couple of days I have seen all Italian universities dropping by on a single post of this blog; they all come directly to the blog. Strangely enough they did not visit this post, which is on the same subject. Apparently there is some e-mail traffic going around there, since a few are referred from e-mail hosting services.
Netvibes has finally overtaken Bloglines as the most popular feedreader for subscribers to the feed. In the beginning of the month there were 156 Bloglines subscribers (using the Feedburner feed) and 123 Netvibes subscribers; the figures are now 168 vs 161. The success of Netvibes over Bloglines is also confirmed by the referrals as measured by Google Analytics: 147 vs 145. There is actually no real need in either system to click through to the blog, since the posts are syndicated as a whole.
31 januari 2007
Citation analysis for research evaluation
Yesterday I posted the recommended reading list for our half-day course on citation analysis. Herewith the slides that I used over the whole morning. It was quite intensive, but I really enjoyed the discussions we had with the participants. It was also great to greet a truly external course participant (and blogger) as well.
30 januari 2007
Citation analysis and research performance, a reading list
The two most current ‘bibles’ for citation analysis
Moed, H. F. (2005). Citation analysis in research evaluation. Dordrecht (The Netherlands), Springer. 346 pp.
This book deals with the evaluation of scholarly research performance. Its principal question is: how can citation analysis be used properly as a tool in the assessment of such a contribution? In order to be used properly as a research evaluation tool, it is essential that all participants have insight into the nature of citation analysis, how its indicators are constructed and calculated, what the various theoretical positions state about what they measure, and what are their potentialities and limitations, particularly in relation to peer review. (from the cover)
Moed, H. F., W. Glänzel, et al., Eds. (2004). Handbook of Quantitative Science and Technology Research : The use of Publication and Patent Statistics in Studies of S&T Systems. Dordrecht (The Netherlands), Kluwer Academic Publishers. 800 pp.
And the most important journal on this subject:
Scientometrics ISSN 1588-2861, Springer.
Roth, D. L. (2005). The emergence of competitors to the Science Citation Index and the Web of Science. Current Science 89(9): 1531-1535. http://www.ias.ac.in/currsci/nov102005/1531.pdf
This article points to some interesting (free) resources for citation data.
Meho, L. I. and K. Yang (2007). Impact of data sources on citation counts and rankings of LIS faculty: Web of Science vs. Scopus and Google Scholar. Journal of the American Society for Information Science and Technology.
Web of Science
Jacsó, P. (2007). Web of Science. Peter's digital reference shelf. Retrieved 24 January, 2007, from http://www.gale.com/reference/peter/200701/wos.htm.
A recent review highlighting the new Citation report features and the h-index
Jacsó, P. (2006). Scopus revisited. Peter's digital reference shelf. Retrieved 27 June, 2006, from http://projects.ics.hawaii.edu/~jacso/gale/scopus-revisited/scopus-revisited.htm.
Jacsó, P. (2005). Google Scholar. Peter's digital reference shelf. Retrieved Oct. 2005 from http://www.galegroup.com/reference/archive/200412/googlescholar.html.
Gerritsma, W. (2006). Wetenschappers gewogen : een systeem voor citatieanalyses in de praktijk. Informatie Professional 10(10): 12-17. (in Dutch)
Hirsch, J. E. (2005). An index to quantify an individual's scientific research output. PNAS 102(46): 16569-16572. http://arxiv.org/abs/physics/0508025
van Raan, A. F. J. (2006). Comparison of the Hirsch-index with standard bibliometric indicators and with peer judgment for 147 chemistry research groups. Scientometrics 67(3): 491-502.
Seglen, P. O. (1997). Why the impact factor of journals should not be used for evaluating research. BMJ 314(7079): 497-502.
Opthof, T. (1997). Sense and nonsense about the impact factor. Cardiovascular Research 33(1): 1-7.
Dong, P., M. Loh, et al. (2005). The "impact factor" revisited. Biomedical Digital Libraries 2(7).
Shanghai Jiao Tong University http://ed.sjtu.edu.cn/ranking.htm
THES World University rankings http://www.thes.co.uk/worldrankings/
van Raan, A. F. J. (2005). Challenges in Ranking of Universities. First International Conference on World Class Universities, Shanghai Jiao Tong University, Shanghai, June 16-18, 2005.
27 januari 2007
Why Scopus doesn't add substantially to the number of citations found in WoS
How come, we are asked?
It is actually quite simple to explain. Garfield (1997) showed already that 2,000 journals of the Science Citation Index generated over 80% of all citations. Web of Science as a whole covers some 8,700 journals (interesting to sort out how many exactly, since this appears a disguised number as well). Scopus nearly doubles the journal base compared to WoS. But considering the fact that WoS already covers the most prestigious, important, cited journals, the doubling in journals only increases the total number of citations a wee bit.
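The arithmetic behind this argument can be illustrated with a toy model. The sketch below is not Garfield's data: it simply assumes that the citations a journal attracts fall off as 1/rank (a Zipf-like distribution), and it takes 17,400 journals as a stand-in for "nearly double" the WoS journal base. Both are assumptions made for the sake of illustration.

```python
def harmonic(n):
    """Total citations attracted by the top n journals, assuming
    (as a toy model) that the journal at rank r attracts 1/r citations."""
    return sum(1.0 / rank for rank in range(1, n + 1))


def share_of_top(top, total):
    """Fraction of all citations that goes to the `top` highest-ranked journals."""
    return harmonic(top) / harmonic(total)


def gain_from_expansion(old_size, new_size):
    """Relative increase in citations when the journal base grows,
    assuming the added journals are all lower ranked than the existing ones."""
    return harmonic(new_size) / harmonic(old_size) - 1.0


core_share = share_of_top(2000, 8700)            # core of 2,000 within WoS
extra = gain_from_expansion(8700, 17400)         # assumed doubling of the base

print(f"Share of citations in the top 2,000 journals: {core_share:.1%}")
print(f"Extra citations from doubling the journal base: {extra:.1%}")
```

Under these assumptions the core of 2,000 journals takes well over 80% of the citations, and doubling the journal base adds less than a tenth. The real distribution will differ, but the shape of the argument is the same.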
Some subject specific databases such as SciFinder Scholar for chemistry, or PsychInfo for Psychology/Psychiatry will find more citations for journal articles on their domain since they have an even wider journal base on their domain than either WoS or Scopus.
I have tried to indicate this in the following figure.
The WoS square has a journal base of 8700 journals and attracts in total a certain amount of citations. The journal base of Scopus is nearly double that of WoS, but not overlapping. The CAS (SciFinder Scholar) has a smaller database that partly overlaps with WoS and Scholar, but (not properly indicated) a substantial number of journals are unique to CAS. On that smaller domain you are likely to find a few more citations.
I am not yet happy with the figure, but I hope it helps to illustrate this whole explanation.
Garfield, E. (1997). The significant scientific literature appears in a small core of journals. The Scientist 10(17): 13. http://www.garfield.library.upenn.edu/papers/currscience.html
How many scholarly journals are out there?
Dong et al. (2005) start their article as follows: "The number of periodical peer-reviewed scientific publications is conservatively estimated to exceed 16,000 worldwide; nearly 1.4 million articles are published every year". They based their number on studies by Mabe published in 2001 and 2003. Another article (Ioannidis, 2006) starts as follows: "Despite a very large number of scientific journals (probably exceeding 100,000 worldwide), the concentration of scientific information is skewed to a minority of journals that publish the majority of the articles (Bradford's law) and receive the majority of the citations." It is the unsubstantiated remark between brackets, "probably exceeding 100,000 worldwide", that really struck me.
Way back in 2003 I started a discussion with Stevan Harnad on the number of peer reviewed journals that existed at that moment. Based on Ulrich I came to the conclusion that there were about 18,846 academic journals out there. In that same discussion a manager from Ulrich came up with the figure of 24,116 refereed serials. Refereed serials include refereed journals as well as refereed proceedings. The last one in the thread was Carol Tenopir who has kept track of these numbers quite regularly and showed that the numbers vary a lot according to search strategy. The most comprehensive number was at that moment 43,667 academic/scholarly periodicals.
Slightly later Carol Tenopir wrote a column on this subject in Library Journal where she highlighted this seemingly simple question. She concluded "I can say with confidence that as of the end of 2003, there are just under 50,000 scholarly journals and somewhere between one-third and just over one-half of them are in digital form."
Actually, since then Harnad uses 24,000-50,000, which is more than 16,000 and a lot less than 100,000.
Dong, P., M. Loh, et al. (2005). The "impact factor" revisited. Biomedical Digital Libraries 2(7). http://www.bio-diglib.com/content/2/1/7
Ioannidis, J. P. A. (2006). Concentration of the Most-Cited Papers in the Scientific Literature: Analysis of Journal Ecosystems. PLoS ONE 1(1): e5. http://dx.doi.org/10.1371/journal.pone.0000005
Mabe, M. (2003). The growth and number of journals. Serials 16(2): 191-197. http://uksg.metapress.com/link.asp?id=f195g8ak0eu21muh
Mabe, M. and M. Amin (2001). Growth dynamics of scholarly and scientific journals. Scientometrics 51(1): 147-162.
Tenopir, C. (2004). Online scholarly journals: How many? Library Journal 129(2): 32. http://www.libraryjournal.com/index.asp?layout=articlePrint&articleID=CA374956
26 januari 2007
This post does not have the usual number of links, since most of it was written at the airport, on the plane or on the train.
The conference was organized by Roger Mills from Oxford University Library Services. A fully packed day with eleven presentations. Was it worth it?
My main goal in attending this conference was to establish contacts with the people from Intute and to look at the possibilities for Intute to make use of the Wageningen UR Library resources. Now that we have moved all of our systems to the Oracle database in XML format, it must be easy to think of services that could harvest our electronic resources, either from the catalog or from our repository. After these initial contacts, and having met the right people, it should be possible to pull this through. Intute is seriously looking into these new ways of sharing and re-using information. Let's call it web 2.0ish.
There were also two presentations from Intute on the program. One was from the people working on the Health and Lifesciences (formerly Biome) part of Intute, about the new site that was launched in July 2006. Quite impressive: they have currently catalogued about 31,000 Web resources. Most interesting I found the fact that they are harvesting other web resources as well. That is what I came for. The second presentation was more exploratory, attempting to sketch a possible road ahead and the opportunities presented by web 2.0 type solutions, technology and user participation. Intute is really looking at it, and with personalization within My Intute and RSS feeds they are making inroads. The discussion which followed exposed quite some hesitation about web 2.0/library 2.0 in the audience. Mostly female information professionals, perhaps not yet the next generation in these positions.
Earlier this month the British Library launched UK PubMedCentral. A site which mirrors NLM PubMedCentral and is aimed at adding UK content. Since the launch the first 250 British papers have been archived. This UK content is subsequently mirrored to the USA PMC. The presentation by the engagement officer for UK-PMC was a bit confusing at times, partly because he was quite new to the subject, partly because it is a brand new service that still needs to create its niche. They hadn't thought about the question of institutional repositories versus subject specific repositories yet. Well, at least he didn't have the answers, but he was willing to take these questions back. There were plenty of ideas about the possible developments with UK PubMedCentral, however. Where the British Library has really worked hard is in making it easy for researchers to submit their articles. Whether it is the scientists who actually deposit, or rather the lab assistants, departmental secretaries or perhaps librarians who are left to do these jobs, hadn't sunk in yet.
Quite interesting was the presentation by M. Dvray from the Mann Library at Cornell. She was really proud of her background as a scientist, which enabled her to take up all kinds of new roles as a liaison between the library and the scientists. Actually a situation very much like that of the information specialists at our Library. But what I really liked were the projects she pointed out she was working on, like their VIVO website, which is in reality the service that our we@wur should be. They were already a lot further at Cornell.
A beautiful presentation by Sally Rumsey on the brand new Oxford repository. Well they call it Archive. Oxford Research Archive http://ora.ouls.ox.ac.uk/ in full. It had a soft launch last Monday. So this was actually real news. Sally pointed out that convincing the researcher to submit their work was their hardest, but most important job. The competition with UK PMC was not making things easier in that respect.
Roger Mills did a presentation on behalf of Michael Popham on the Google Book project. It was interesting that Roger pointed out that they were going to link from the catalogue to their own copies in Google Books, but that this wasn't working yet. Well, actually the global library community is waiting for that one to happen. If they can do that from Oxford, we can in principle do it from any library catalogue. More is to be found at http://www.bodley.ox.ac.uk/google/
The most interesting presentation for me personally was the one about curation of actual data. A subject that we were already discussing at our department 15 years ago, but one we still can't get to grips with today. In those days I was working with crop growth simulation models. And these are really data hungry, to verify the models. So it is a really felt need, but there are no solutions. Unfortunately Chris Rusbridge didn't have the answers either, but in principle there is potentially an important role for libraries. We only have to develop the answers and craft our niche in this area. There is certainly room for us in that area. Worth thinking about.
There were three more presentations, one on evidence based forestry, where I really missed the small hint or step to evidence based librarianship, a very new and important theme in our own profession. It was not mentioned at all. A bit strange. The last two presentations were actually sales pitches, one from CABI and the other from CSA. Just before that last presentation I had to leave. To catch the tube, train, plane etc. to get home.
24 januari 2007
It gives you at least some doubts about the inherent eagerness of libraries to catalog all valuable resources.
Hattip: Gwen Harris
23 januari 2007
The review is very thorough and I really would love to learn some of the tricks which he uses in his database reviews as well. It was a pleasure to digest it all. One of those things that I never realized was the fact that abstracts were only included systematically from 1991 onwards. I knew they were more often lacking from the older material, but this sharp demarcation line in the change of policy was not known to me. It would have been good if Jacsó had pointed out the changes in naming policy as well. Once it was 15 characters, later 18, and nowadays first names are indexed as well. All this makes comprehensive searches for long standing researchers sometimes difficult. Well, challenging at least. Especially when you're dealing with double names. "F.C.T. Penning de Vries" is one of my favourites. He can be found as DEVRIES P in the cited references. ISI has promised author disambiguation, but this has not reached the science commons yet. And Penning de Vries was not a bad scientist after all. On author disambiguation Scopus does a better job.
The new citation reports in WoS are swell indeed. The h-index (not Hirsch index, so Hirsch told us) implementation is good indeed, and very useful to apply to all kinds of search results. Journals amongst others, as Jacsó did for some of the LIS journals. I also hope to see these results included in the Journal Citation Reports next year.
But after all this praise some grumps as well. It is a rather old one. Sloppy indexing by ISI. Jacsó came up with the example of author names. Issues, volumes and page numbers also go wrong quite often. Although Moed (2005, p.175) has established that this is only in the order of about 7-8%.
We use WoS quite often within the library itself for collection development. We look at our authors and examine their reference lists. That quickly runs into the tens of thousands of references, and we would love to see some uniformity in the journal names of those references. This is clearly illustrated when you search for a journal in the cited reference search. With the cited reference search you should use the abbreviated journal name (as opposed to the full search, where full journal names are requested, but I am not complaining). Take for instance the AUSTRALIAN JOURNAL OF AGRICULTURAL RESEARCH; the official abbreviation is AUST J AGR RES. But don't dare to think that on entering it you have found all instances of this journal. Oh no! You have found only 33,276 citations for this abbreviation. But you missed AUST J AGR RESEARCH (1), AUST J AGRIC RES (1158), AUST J AGRIC RS (23), AUST J AGRIS RES (1), AUSTR J AGR RES (11417), AUSTR J AGR RESEARCH (1), AUSTR J AGR RESER (6), AUSTR J AGR RS (2), AUSTR J AGRI RES (24), AUSTR J AGRIC RES (127), AUSTR J AGRIC RESEAR (1), AUSTR J AGRICULT RES (6), AUSTR J AGRICULTURAL (251?), AUS J AGR RES (35), AUS J AGRIC RES (3), AUS J AG RES (19), AUST J AG RES (90), AUST J AGIC RES (3), and there are probably a few other variations I missed. My point is, however, that if you analyze a large number of references you inevitably end up with a lot of variations of journal names. This doesn't apply only to this particular instance. We try to monitor the usage of about 8000 journals. With sloppy data as in the above example it becomes a real tour de force, which I would love to see better facilitated on the side of ISI. Obvious corrections should therefore be made in the primary data, rather than in the software, which is what happens to some extent now.
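For those who want to check such a figure by hand: following Hirsch's definition, the h-index is the largest number h such that h items in the set have each been cited at least h times. A minimal sketch in Python, where the function name and the example citation counts are my own and not taken from WoS:

```python
def h_index(citation_counts):
    """Largest h such that h items have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for position, cites in enumerate(ranked, start=1):
        if cites >= position:
            h = position   # the top `position` items all have >= position citations
        else:
            break          # counts only get smaller from here on
    return h


# Hypothetical result set: times cited for five papers.
print(h_index([10, 8, 5, 4, 3]))  # prints 4
```

The same function works for any result set: an author, a journal, or a whole department.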
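Cleaning this up on the user side roughly comes down to mapping each variant to the official abbreviation and adding up the counts. The sketch below is not how ISI or our own software does it; it is a simple illustration that uses fuzzy string matching from Python's standard library. The function name, the cutoff of 0.8 and the short list of official abbreviations are assumptions, and the counts are a subset of the figures above.

```python
from collections import defaultdict
from difflib import get_close_matches


def consolidate(variant_counts, canonical_names, cutoff=0.8):
    """Add up citation counts per official abbreviation.

    Each variant is assigned to the most similar official abbreviation,
    provided the similarity is at least `cutoff`; otherwise it is
    reported back as unmatched for manual inspection.
    """
    totals = defaultdict(int)
    unmatched = {}
    for variant, count in variant_counts.items():
        match = get_close_matches(variant.upper(), canonical_names,
                                  n=1, cutoff=cutoff)
        if match:
            totals[match[0]] += count
        else:
            unmatched[variant] = count
    return dict(totals), unmatched


# A few of the variants listed above, with their citation counts.
variants = {
    "AUST J AGR RES": 33276,
    "AUSTR J AGR RES": 11417,
    "AUST J AGRIC RES": 1158,
    "AUS J AG RES": 19,
}
official = ["AUST J AGR RES", "J EXP BOT"]  # hypothetical authority list

totals, unmatched = consolidate(variants, official)
print(totals)
print(unmatched)
```

A word of caution: fuzzy matching will happily merge two different journals with similar abbreviations, so the cutoff needs tuning and the results need checking against an authority list. That is exactly why corrections in the primary data would be preferable.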
But to end this post on a positive note, I quote Jacsó's conclusions in full
"ISI has kept adding new content and software features through regular updates. The latest services clustering of results set by several criteria, the instant calculation and superbly informative and compact visualization of new citation measures, such as the sum of times a paper was cited (including and excluding self-citation, the average citations per item, the Hirsch-index, the almost instant display of charts for the distribution of articles and citations per year by authors, journals, organizations or topic, the exporting of these details into a spreadsheet format, or downloading to a free Web version of EndNote, represent more than a series of evolutionary steps. It is a breakthrough for those interested in citation analysis, but did not have the resources to calculate key citation performance measures, or did not have the software to format them to the whims of the publishers' manuscript guidelines."
Jacsó, P. (2007). "Web of Science." Peter's digital reference shelf Retrieved Jan 2007 from http://www.gale.com/reference/peter/200701/wos.htm.
Moed, H. F. (2005). Citation analysis in research evaluation. Dordrecht, Springer. 346p.
19 januari 2007
It is interesting to turn this discussion to Google (once again). For a long time Google has had the famous GoogleGuy, who reacted to forum discussions and the blogosphere. Later Matt Cutts started his own blog and has organized some really great PR for Google. It is speculated that GoogleGuy and Matt Cutts are the same person. I don't know for sure and don't really care.
On my grumpy post yesterday about the erratic behaviour of Google Blogsearch, there was a prompt reaction from a Google engineer. Yes indeed, there was something wrong on the Blogsearch end of the spectrum. It had nothing to do with the tango from Blogger to Blogger Beta to New Blogger. He will investigate the matter in detail and will report back later. That's great. Jeremy, I eagerly await the results.
It is also exemplary how Google tackles this whole complex marketing issue of blogs and virals. Interacting in blogs and forums was not sufficient anymore, so they had Matt Cutts start his own blog with a strong voice. Matt's blog can also be effective in setting the stage, commanding the discussion, and in the worst case scenarios functioning as a lightning rod. In addition, Matt is a frequent keynote speaker or attendee at all kinds of Search Engine Conferences.
In conclusion the UPC webcare team is not sufficient to improve their image. They are on the right track, but have to take it a step further.
18 januari 2007
16 januari 2007
What is going on?
Probably just another hiccup of the new Blogger.
What is badly missing though is a cure.
Has anybody any ideas?