Finding files in Mendeley

Mendeley is a great piece of Software to organize one’s collection of PDF articles. It did a fairly good job indexing the several thousand PDF files on my harddisk. Often however, the automatic metadata extraction (authors, titles, journal name) failed, and garbage information appears in Mendeley’s list of documents.

When I want to use a particular citation and cannot find it in Mendeley, chances are that the local PDF got indexed with the wrong metadata. Since I use a consistent naming scheme for my PDF files, I often know the name of the file which belongs to the reference. In such a situation, it would be really handy to be able to search for a specific local file in Mendeley to be able to update the document entry with the correct information; however, there is no such functionality.

Since Mendeley stores all the information in an SQLite database file, it is quite simple to perform the search “manually”, using the sqlite3 command line tool (or something similar), which is available for free. You can lookup the location of the database file in the Mendeley FAQ.

I use the following queries to find the database entry for the PDF file (I’m not a database expert):

SELECT hash FROM files WHERE localUrl LIKE '%TextInPDFFileName%'
SELECT documentId FROM documentfiles WHERE hash = [Hash I just found]
SELECT title FROM documents WHERE (NOT deletionPending = 'true') AND id = [ID I just found]
SELECT lastName FROM documentcontributors WHERE documentId = [ID I just found]
SELECT publication FROM documents WHERE (NOT deletionPending = 'true') AND id = [ID I just found]

The last three queries obtain information on the document as it is known by Mendeley. Using this information, I can easily locate the document in Mendeley, correct the information, and insert the reference in my manuscript.

On all platforms (including Windows), the free FireFox extension SQLiteManager should be able to perform the queries.

Linux users may try my very simple, very crude script (it requires the sqlite3 program, available in your distribution’s repository):

#!/bin/bash
# urldecode function to change the encoded characters to real ones
urldecode() {  echo -e "$(sed 's/+/ /g; s/%/\x/g')"; }
  
echo -e "$(basename ${0%.sh}) Utility to look up files that are attached to documents in the Mendeley database."
  
if [[ -z "$1" || "$1" == "-h" || "$1" == "--help" ]]; then
  echo -e "Usage: $(basename $0) SearchTermn"
  echo -e "The script searches for SearchTerm in the names and directories of"
  echo -e "the local files attached to the documents in the Mendeley database"
  echo -e "and prints information about the document as it is known by Mendeley."
  exit 0
fi
  
# Standard path where the Mendeley database is stored.
STDPATH=${HOME%"/"}"/.local/share/data/Mendeley Ltd./Mendeley Desktop"
  
SQLITE="sqlite3"
SQLITE_FINDHASH="select hash from files where localUrl like"
SQLITE_GETURL="select localUrl from files where hash ="
SQLITE_LOOKUPHASH="select documentId from documentfiles where hash ="
SQLITE_LOOKUPTITLE="select title from documents where (not deletionPending = 'true') and id ="
SQLITE_AUTHORS="select lastName from documentcontributors where documentId ="
SQLITE_JOURNAL="select publication from documents where (not deletionPending = 'true') and id ="
  
# Look for the Mendeley database
DBFILE=$(ls "${STDPATH%'/'}/" | grep -m 1 ".mendeley.com.sqlite")
  
if [ -z "$DBFILE" ]
then
  echo "Mendeley database file not found in your home directory."
  exit 1
fi
  
# Concatenate the path and the file name
DBFILE="${STDPATH%'/'}/$DBFILE"
echo -e "Mendeley database found at: $DBFILEn"
  
echo -e "Finding:t"$1
echo -e "========n"
  
# Get the hashes corresponding to the file(s)
HASHES=`$SQLITE "$DBFILE" "$SQLITE_FINDHASH '%$1%';"`
  
if [ -z "$HASHES" ]; then
  echo -e "tt*** Not found. ***"
  exit 3
fi
  
for HASH in $HASHES; do
  # Find the url and document ID for this hash

  URL=`$SQLITE "$DBFILE" "$SQLITE_GETURL '$HASH';"`
  DOCIDS=`$SQLITE "$DBFILE" "$SQLITE_LOOKUPHASH '$HASH';"`

  # A file may be attached to several documents, therefore we
  # loop through the all document IDs found.
  for DOCID in $DOCIDS; do
    # Retrieve the title(s)
    # The query is constructed so that it will exclude titles
    # of documents which have been deleted (deletePending = true).
    TITLE=`$SQLITE "$DBFILE" "$SQLITE_LOOKUPTITLE '$DOCID';"`
  
    # Extract the file name and urldecode it (see function at the top
    # of the script)
    FILENAME=$(basename $URL)
    FILENAME=$(echo $FILENAME | urldecode )
  
    # Extract the name of the subdirectory that contains the file
    # (the local "Journal name")
    LJOURNAL=$(dirname $URL)
    LJOURNAL=${LJOURNAL##*/}
  
    # Get the author information from Mendeley
    AUTHORS=`$SQLITE "$DBFILE" "$SQLITE_AUTHORS '$DOCID';"`
  
    # Get the journal information from Mendeley
    MJOURNAL=`$SQLITE "$DBFILE" "$SQLITE_JOURNAL '$DOCID';"`
  
    # If the title is very long, shorten it and add ellipsis
    if [ ${#TITLE} -gt 60 ]; then
      TITLE=${TITLE:0:57}"..."
    fi
  
    # Print the information
    if [ ! -z "$TITLE" ]; then
      echo -e "File name:t"$FILENAME
      echo -e "Local journal:t"$LJOURNAL
      echo -e "In Mendeley:"
      echo -e "  Document ID:t"$DOCID
      echo -e "  Authors:t"$AUTHORS
      echo -e "  Journal:t"$MJOURNAL
      echo -e "  Title:t"$TITLE"n"
    fi
  done # for DOCID in $DOCIDS
done # for HASH in $HASHES