Using Dates with Android Lists and SQLite

Our TeamCaptain app checks the web for changes to the schedule and news for recreational sports leagues, providing update notifications and local (cached) viewing on mobile devices. On Android, this data is stored in a SQLite database. As the SQLite page on Datatypes In SQLite Version 3 states, there isn’t a specific storage class for these:

SQLite does not have a storage class set aside for storing dates and/or times. Instead, the built-in Date And Time Functions of SQLite are capable of storing dates and times as TEXT, REAL, or INTEGER values:

  • TEXT as ISO8601 strings (“YYYY-MM-DD HH:MM:SS.SSS”).
  • REAL as Julian day numbers, the number of days since noon in Greenwich on November 24, 4714 B.C. according to the proleptic Gregorian calendar.
  • INTEGER as Unix Time, the number of seconds since 1970-01-01 00:00:00 UTC.

Applications can chose to store dates and times in any of these formats and freely convert between formats using the built-in date and time functions.

I chose to store date values as an integer, but when Binding to Data with AdapterView using an Android SimpleCursorAdapter, the result isn’t so useful to the user:

TeamCaptain original Date view

A common solution for this seems to be to create a custom CursorAdapter like the example in this answer found on StackOverflow. This works, but I decided to choose a different that I think is more reusable: creating a custom ViewBinder for binding date fields to an Android TextView.

public class DateViewBinder implements ViewBinder {
	private SimpleDateFormat mFormat;
	private int mTargetViewId;

	public DateViewBinder(SimpleDateFormat format, int targetViewId) {
		mFormat = format;
		mTargetViewId = targetViewId;
	}

	public void setBinding(int from) {
		mTargetViewId = from;
	}

	public void setFormat(SimpleDateFormat format) {
		mFormat = format;
	}

	@Override
	public boolean setViewValue(View view, Cursor cursor, int columnIndex) {
		final int id = view.getId();
		if (id == mTargetViewId) {
			final String value = getLongFieldAsString(cursor, columnIndex);
			if (view instanceof TextView)
				setViewText((TextView)view, value);
			else
				throw new IllegalStateException(view.getClass().getName() + " is not a view that can be bound by this view binder (" +
						DateViewBinder.class.getSimpleName() + ")");
			return true;
		}

		return false;
	}

	private String getLongFieldAsString(Cursor cursor, int columnIndex) {
		final long data = cursor.getLong(columnIndex);
		if (data == 0)
			return "";
		else {
			final Date date = new Date(data);
			return mFormat.format(date);
		}
	}

	private void setViewText(TextView view, String value) {
		view.setText(value);
	}
}

And to use it in the code:

SimpleCursorAdapter adapter = new SimpleCursorAdapter(this, R.layout.schedule_entry, cursor, dataColumns, viewIDs);
/** A date format.  Example: 1:30 pm on Wednesday, Jan. 2 **/
final SimpleDateFormat format = new SimpleDateFormat("hh:mm a 'on' EEEE, MMM'.' d", Locale.ENGLISH);
final DateViewBinder binder = new DateViewBinder(format, R.id.dateTime);
adapter.setViewBinder(binder);

Now we’ve got something much more understandable for our users:

Date view in TeamCaptain using the custom ViewBinder

This can of course be further expanded to take in an array/list of view IDs for when more than one date and/or time field is going to be shown using a single adapter. Hope this helps!

Reading/Parsing a PDF File with C#

Last week, I had the pleasure of dealing with PDF files…for processing data. If you are [or potentially could be] doing EDI with a third party, please please PLEASE be able to provide something other than a PDF format; while PDF is a great format for human-reading or printing, using it for EDI is a no-no. If there is a standard format for the type of data you’re sending, that’s ideal. In this particular case, since the client was only sending data in PDF format, dealing with it was a necessity.

Looking at some sample PDF files, the data was actually text and not in the form of images (thankfully). I found a [seemingly] quick and simple write-up on parsing/reading a pdf file with c# and asp.net to text, which may be useful for many. In my particular case, though, our projects are signed–so in order to use the Apache PDFBox library mentioned there, I would need to sign it.

The Apache PDFBox library is open source, but written in Java and using both Apache Ant for automating the build process and IKVM.NET (IKVM.NET is an implementation of Java for Mono and the Microsoft .NET Framework). So, I decided to try just signing the assemblies without recompiling first.

I found Signer from the Alois Kraus blog for signing third party assemblies and decided to give it a try. From the project page:

How does it work?

Signer does basically a full round trip by decompiling the assembly into IL code make the necessary modifications and compile it back to a valid assembly. The required modifications include

  • Update of all references
  • Change/Removal of InternalsVisibleToAttribute
  • Update of custom attributes with a type parameter
  • A little fix to work around an ILDASM problem

Although running the assemblies through Signer claimed to be successful, the assemblies were left yet unsigned (after a failed attempt at loading them in my project, I used the corflags tool to verify that the ‘Signed’ flag was still not set).

I also tried following the instructions from a post on Ryan Farley’s blog covering how to Sign a .NET Assembly with a Strong Name Without Recompiling, but again without any luck (the resulting assembly appeared to have a corrupt header with an unrecognizable assembly name).

With some further digging, I found an issue ticket for PDFBox in the issue tracking system related to getting some signed assemblies, but also ran into issues. When a quick attempt at building the project manually and following the .NET version instructions also failed, I decided to take a different approach. More time probably would have resulted in a working outcome, but I didn’t feel the time and effort was worth it.

Using Process and the PDFBox Command-line Tools
In the end, I decided to use the Process class to send each file I was working with over to the ExtractText application included in the PDFBox Command Line Utilities. With some tweaked ProcessStartInfo for grabbing the output
and its StandardOutput property, I was able to get what I needed:

public static string ReadPDF(string inputFilePath)
{
	try
	{
		// first, build our ProcessStartInfo so we can grab the console output
		string arguments = "-console "" + inputFilePath + """;
		string pathToPDFBox = Properties.Settings.Default.PathToPDFBoxExtractText;
		if (string.IsNullOrEmpty(pathToPDFBox))
			throw new Exception("Unable to find PDFBox (the setting 'PathToPDFBoxExtractText' has not been set)");
		ProcessStartInfo startInfo = new ProcessStartInfo()
		{
			FileName = pathToPDFBox,
			Arguments = arguments,
			UseShellExecute = false,
			RedirectStandardOutput = true
		};
		// now launch the process and grab the result
		string output = string.Empty;
		using (Process pdfReaderProcess = new Process() { StartInfo = startInfo })
		{
			pdfReaderProcess.Start();
			output = pdfReaderProcess.StandardOutput.ReadToEnd();
			pdfReaderProcess.WaitForExit();
		}

		return output;
	}
	catch (Exception ex)
	{
		string message = string.Format("There was a problem reading the PDF file '{0}': {1}",
			inputFilePath, ex.Message);
		throw new Exception(message, ex);
	}
}

I was working with some smaller test files, so in order to just test it out I just put something together real quick. Really, you’d want to use the GOCR.

If any of you have had a related experience, I’d love to hear about it.

Implementing bulkInsert for a custom Android ContentProvider

One of our apps implements a custom ContentProvider for managing the data it uses. The insert method gets a writable instance of the database, matches against the Uri, makes sure the appropriate values are included, inserts the entry and returns the Uri of the new entry. This is fine for when the app finds a new entry or two to add, but this seemed a little wasteful for the first update (when a user first subscribes, all of the current existing entries are pulled and added at once). With a little searching I found a related StackOverflow post, where user jcwenger explains it nicely:

However, since each Operation in the Batch can have a different URI and so on, there’s a huge amount of overhead. “Oh, a new operation! I wonder what table it goes in… Here, I’ll insert a single row… Oh, a new operation! I wonder what table it goes in…” ad infinitium. Since most of the work of turning URIs into tables involves lots of string comparisons, it’s obviously very slow.

By contrast, bulkInsert applies a whole pile of values to the same table. It goes, “Bulk insert… find the table, okay, insert! insert! insert! insert! insert!” Much faster.

It will, of course, require your ContentResolver to implement bulkInsert efficiently. Most do, unless you wrote it yourself, in which case it will take a bit of coding.

But wait, requires your ContentResolver to implement bulkInsert efficiently? How does one do that? Sure enough, checking the source code and comments of the abstract ContentProvider class for the bulkInsert method verifies that we have to implement bulkInsert ourselves.

By default, SQLite creates a journal file for each transaction (which is deleted once the transaction completes). For the bulk insert, we should specify a single transaction. A single Uri is passed in, so we should also only check that once. Here is what my bulkInsert logic looks like:

@Override
public int bulkInsert(Uri uri, ContentValues[] allValues) {
	SQLiteDatabase db = mHelper.getWritableDatabase();
	switch (URI_MATCHER.match(uri)) {
	case URI_EVENTS:
		return insertEvents(db, allValues);
	case URI_NEWS_POSTS:
		return insertNewsPosts(db, allValues);
	// ...
	default:
		throw new IllegalArgumentException("Unknown URI " + uri);
	}
}

private int insertEvents(SQLiteDatabase db, ContentValues[] allValues) {
	int rowsAdded = 0;
	int rowId;
	ContentValues values;
	try {
		db.beginTransaction();
		
		for (ContentValues initialValues : allValues) {
			values = initialValues == null ? new ContentValues() : new ContentValues(initialValues);
			rowId = insertEvent(db, values);
			if (rowId > 0)
				rowsAdded++;
		}
		
		db.setTransactionSuccessful();
	} catch (SQLException ex) {
		Log.e(LOG_TAG, "There was a problem with the bulk insert: " + StringUtils.exceptionToString(ex));
	} finally {
		db.endTransaction();
	}
	
	return rowsAdded;
}

private int insertEvent(SQLiteDatabase db, ContentValues values) {
	if (!values.containsKey(TeamCaptainData.EventColumns.ORGANIZER))
		throw new IllegalArgumentException("Missing event column '" + TeamCaptainData.EventColumns.ORGANIZER + "'");
	// ...do some processing (check for more missing fields, set default values, etc.)
	return db.insert(TABLE_EVENTS, null, values);
}

Note that in this particular application, most bulkInsert calls will be outside of the SQLiteOpenHelper callback methods (onCreate, onUpgrade and onOpen); SQLite creates a transaction before invoking these methods, so when working within them it’s unnecessary to add explicit transaction control. Also, only one of the data columns can be empty (and empty is not the norm). A large batch of insertions is also usually only necessary when the user first subscribes to a particular organizer, which is not expected to happen often. If you’re looking at a different scenario or are just curious about possibly optimizing your bulkInsert implementation further, you might want to check out the DatabaseUtils.InsertHelper class. Out Of What Box? has a blog post, Android: Using DatabaseUtils.InsertHelper for faster insertions into SQLite database, that covers this in more detail.