[Devember2021] - Image caption manager

inakleinbottle · October 31, 2021, 10:46am

Greetings, I’m a first-time Devemberer and generally clueless about developing a complete software project. My project is a caption manager for images, whereby a user can add an image and tag it with a caption (or longer description) for future reference.

Context

The idea for this project came from my mother. She wants something to help her keep track of the photos and other images that she has, and to attach notes to those images recording details about them. I’m sure there are such software packages out there, but since I was eager to find a project for Devember this seemed like a perfect idea. Hopefully, the added incentive to produce a complete project in Devember will be sufficient motivation to actually get this done.

Final product specification

This is going to be a GUI application, probably in Qt since it needs to be cross platform and relatively easy to write. It’s going to need some kind of database backend, probably a simple sqlite database, that links image URIs (file paths, and possibly URLs) to the text associated with it. If I have time, I might also incorporate a tagging system.

The GUI will be very simple (by necessity), with a list of images on one side, a preview window, and a text box in which a description can be typed. I think this is about the limit of my abilities, having never done any GUI programming before. There need to be buttons for adding new images, removing images, and exporting the image along with text into some format. (I don’t yet know how this export functionality will work.)

Learning

I think it is important to learn something in every project one works on so, with this in mind, this is a list of things I think I will learn in this project:

how to write a GUI application;
working with relational databases and queries;
retrieving and previewing images inside the gui; and
exporting to some file format (like PDF or HTML).

Details

I’ve created a GitHub repository for this project, which is blank at the time of writing (link). Apart from that, I don’t know if there is anything else Devembery that I need to do; please let me know.

Happy coding!

vivante · November 1, 2021, 7:19am

This might be a problem down the line: what if you move the image to another folder through a different app (say Nautilus or Explorer), or rename it? Maybe tap into image metadata if possible?

pakxo · November 1, 2021, 7:51am

When I first read this post, I thought that there should also be an auto-tagging feature using machine learning or something as well, but it’s fine, you can add those in the future.

inakleinbottle · November 4, 2021, 7:31pm

@vivante I hadn’t thought about this, but I really don’t want to have the images copied either into the database or somewhere else on the filesystem. A simple solution is to just alert the user when the program can’t find the images. I will be inspecting the image metadata to some extent, but I don’t really know what kind of things I will be able to reliably get from this.
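The "alert the user when the program can’t find the images" approach could look roughly like the sketch below. It uses Python's built-in `sqlite3` module rather than Qt, and the `images` table with `id` and `path` columns, as well as the function name `find_missing_images`, are assumptions made for illustration, not the project's actual schema or code.

```python
import os
import sqlite3
import tempfile


def find_missing_images(conn):
    """Return (id, path) for every stored path that no longer exists on disk."""
    rows = conn.execute("SELECT id, path FROM images ORDER BY id").fetchall()
    return [(image_id, path) for image_id, path in rows if not os.path.isfile(path)]


# Demo with an in-memory database and a hypothetical two-column table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, path TEXT NOT NULL)")

with tempfile.TemporaryDirectory() as folder:
    present = os.path.join(folder, "present.jpg")
    open(present, "wb").close()                      # this file exists
    gone = os.path.join(folder, "moved-away.jpg")    # this one was never created
    conn.executemany("INSERT INTO images (path) VALUES (?)", [(present,), (gone,)])
    missing = find_missing_images(conn)

for image_id, path in missing:
    print(f"Image {image_id} could not be found at {path}")
```

The list returned by the check is what a GUI would turn into a warning dialog or a "relocate file" prompt.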

@pakxo Sorry to disappoint, but I probably won’t have time to implement an autotagging ML model. I’m sure this isn’t very hard to do - I do know people in the image classification world - but the overhead of bringing in a heavy-weight library like tensorflow or torch doesn’t seem worth it. I’m already using Qt for the gui, so the packaging is going to be “fun”. (This is for my mother, who uses a Windows laptop, after all.) Maybe if I finish faster than I expect, I can look into this, and I (of course) welcome any suggestions to help along the way.

vivante · November 4, 2021, 7:57pm

I just meant maybe you could use metadata to write a UUID into the image itself. Later you can ID it if the filesystem path changes and it pops up again. You wouldn’t have to treat it as a new unknown photo, just update the path in the DB.

inakleinbottle · November 4, 2021, 8:45pm

Oh right. That’s a really cool idea! I’ll look into this.

inakleinbottle · November 16, 2021, 9:24pm

Right, I think it’s time for an update.

I’ve spent the past two weeks putting together the user interface and the database back-end. I’m using a simple SQLite database, using the Qt database driver, and mapping the columns to the various widgets on the GUI using a relational mapper. The Qt framework makes this integration fairly simple for text fields, but displaying a preview of the image via the path stored in the database proved rather difficult. I had to write a custom “item delegate” to intercept the loading process and load the image. Sadly, the Qt documentation is rather unhelpful in this area.

I am not very good at designing interfaces, so what I’ve got is very simplistic. I might spend some more time on it towards the end if I have time. At the moment, it looks like this.

The left-hand box is a list view that displays the titles of all the images stored in the database. On the right-hand side, the details of the currently selected item are displayed in the boxes. At the top is a box for the title and a calendar widget for the date/time associated with the image - I haven’t worked on this yet. Directly below (outlined) is a preview box for the image itself. Below the preview is a box for adding the caption text, and at the very bottom is a button to save the current record. There are also buttons to filter the list of records and to add records.

At the moment, I have written code for the save function to commit changes and additions back to the database and the basic code for adding new images. This brings up a file select dialog for the user to select a new image, but as of yet this is not connected to the rest of the logic. I’ve been working on the flow that follows the selection of a file but haven’t figured out exactly how to set the list position.

I’m happy with the progress I’ve made so far, and I think I will have time to implement a tagging system for the images. The database should make it possible to link the images table to a separate tag table, but I don’t know yet what form this table should take; I welcome any suggestions from anyone who has ideas.

I think that’s enough. Until next time.

inakleinbottle · December 12, 2021, 9:04pm

Happy half-way through Devember (almost, kinda). Time for another update!

It has been a frustrating couple of weeks working on the tagging system. However, I do now have something that works. Let’s start from the top.

The tagging system is implemented using a pair of tables - in addition to the “images” table that I already had - in the SQLite database. These are tags, which holds the mapping from tag id to tag name, and image_tags, which contains a mapping from image id to tag id. I am vaguely aware that there are other methods for implementing tagging systems, but I wanted something simple that works. The tables I have added look like this:

image_tags                           tags
 id | imageID | tagID                 id | name
----+---------+-------               ----+--------
  1 |       1 |     1                  1 | "tag1"
  2 |       1 |     2                  2 | "tag2"

(I really hope my formatting doesn’t get messed up here.) The sample data in these tables indicates that the image with ID 1 is tagged with the tags “tag1” and “tag2”, with tagIDs 1 and 2, respectively.
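For anyone following along, here is a minimal sketch of that two-table layout using Python's built-in `sqlite3` module (not the Qt driver the project uses). The table and column names come from the post; the `title` column on `images`, the `UNIQUE` constraints, and the helper `tags_for_image` are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE images (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE tags (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE          -- assumption: tag names are unique
);
CREATE TABLE image_tags (
    id      INTEGER PRIMARY KEY,
    imageID INTEGER NOT NULL REFERENCES images(id),
    tagID   INTEGER NOT NULL REFERENCES tags(id),
    UNIQUE (imageID, tagID)            -- assumption: no duplicate links
);
-- The sample data from the tables above.
INSERT INTO images (id, title) VALUES (1, 'first image');
INSERT INTO tags (id, name) VALUES (1, 'tag1'), (2, 'tag2');
INSERT INTO image_tags (imageID, tagID) VALUES (1, 1), (1, 2);
""")


def tags_for_image(conn, image_id):
    """Resolve an image's tag names by joining image_tags against tags."""
    rows = conn.execute(
        "SELECT tags.name FROM image_tags "
        "JOIN tags ON tags.id = image_tags.tagID "
        "WHERE image_tags.imageID = ? ORDER BY tags.id",
        (image_id,),
    )
    return [name for (name,) in rows]


print(tags_for_image(conn, 1))  # prints ['tag1', 'tag2']
```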

Now that I have a working table structure I need a way of adding new tags to an image. I’ve changed the layout of the main window slightly here to add a new button for editing tags:

I have one image in my list (at ID 1) - indeed this is the screenshot from my previous post :D. Below the caption editor box there are now two buttons, the “Edit tags” button and the old save button. Clicking the edit tags button brings up a new dialog:

[Screenshot: caption-manager-tag-editor-12-12-21]

At the top is a text box for writing the name of a new tag to associate with the current image (the one selected in the main window). Below this is a list view that lists all of the tags that are currently associated with the image. This is using a Qt relational model, which was the source of endless pain when trying to get this to work. To the right are three buttons for adding a new tag (using the text from the text box as the name), removing the selected tag, and reverting changes. At the bottom are buttons for canceling current changes or accepting (OK) changes. Both of these buttons close the dialog.

The text box has completion based on tags that are already in the tags table - including those that are not linked to the current image (otherwise it would be pretty useless). All changes are local to the dialog until the OK button is clicked to accept them, they are reverted using the revert button, or the whole editing session is cancelled (and the changes reverted). At present, there isn’t a way to modify tags in-place, but I’m not sure there should be.

The problems I encountered were trying to get the QSqlRelationalTableModel to update the correct tables when adding tags. Indeed, the relational table establishes a correlation between the tagID column of the image_tags table, and the name (via id) of the tags table. I assumed, seemingly wrongly, that adding a new name associated with the current image ID through the model would also perform the insert into the tags table. Unfortunately, neither case seemed to work properly. When the tag name didn’t already exist, the model failed to insert a new entry into tags. When the tag did exist, the relation of the model would not allow me to insert a tagID, imageID pair into the image_tags table. Moreover, when I tried to insert an imageID, name pair, it didn’t associate the tagID with the requested name. (Nightmare!)

In the end I gave up trying to use the model to perform the inserts and instead wrote the SQL code by hand, which is ugly but does what I need it to. I asked on StackOverflow for a more idiomatic solution, but as of yet I’ve had no responses. I did worry that bypassing the model would mean losing the transactional behaviour that models provide, but I managed to get around this by using the transaction function from the database handle itself.
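As a rough illustration of the hand-written approach, the sketch below inserts the tag name if it is new, looks up its id, and links it to the image, all inside one transaction. It is written against Python's `sqlite3` module, so the connection's context manager stands in for the transaction function on the Qt database handle; the `UNIQUE` constraints and the function name `add_tag_to_image` are assumptions, and this is not the code from the repository.

```python
import sqlite3


def add_tag_to_image(conn, image_id, tag_name):
    """Create the tag if needed and link it to the image, atomically."""
    with conn:  # commits if the block succeeds, rolls back if it raises
        # Relies on the UNIQUE(name) constraint: a repeat insert is skipped.
        conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (tag_name,))
        (tag_id,) = conn.execute(
            "SELECT id FROM tags WHERE name = ?", (tag_name,)
        ).fetchone()
        # Relies on UNIQUE(imageID, tagID): linking twice is a no-op.
        conn.execute(
            "INSERT OR IGNORE INTO image_tags (imageID, tagID) VALUES (?, ?)",
            (image_id, tag_id),
        )
    return tag_id


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE image_tags (
    id INTEGER PRIMARY KEY,
    imageID INTEGER NOT NULL,
    tagID INTEGER NOT NULL,
    UNIQUE (imageID, tagID)
);
""")

first = add_tag_to_image(conn, 1, "holiday")   # creates the tag, links image 1
again = add_tag_to_image(conn, 2, "holiday")   # reuses the tag, links image 2
```

Both cases from the post (tag already exists, tag does not exist) go through the same three statements, which is why the uniqueness constraints are doing most of the work here.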

I still have a fair amount of work to do. I haven’t yet implemented rendering PDFs or any other export functionality. I also need to implement the filtering function for the main window, which will also involve some SQL madness and creating a new dialog box. However, I am feeling quite a lot more comfortable with the Qt framework. I also need to figure out how to build a Windows installer for this eventually, in order to make the process of installing it simpler for my mother.

Here is the GitHub link for the code at this point in time: GitHub - inakleinbottle/caption_manager at 44efc2954bf76664f1444e8ccac2feb874baf0c9

I would appreciate any feedback or advice about how to proceed from knowledgeable and/or opinionated members.

Hammerhead_Corvette · December 14, 2021, 11:10pm

How are you handling possible duplicate images? Are you going to implement a hashing function to give each image a unique hash (SHA-2) and list it in the database? Also, is there information like date created, date modified, and type, like you would find in the properties of an image?

inakleinbottle · December 16, 2021, 9:46pm

@Hammerhead_Corvette My original plan was to include some kind of hash digest to uniquely identify images to prevent duplication, but I haven’t implemented this as of yet. At the moment, I only read the file metadata to grab the creation time to populate the database field. In practice, I can grab any of the metadata and store that along with the path but I haven’t really thought about this yet.

I’ll probably use something like SHA2, but I’m wary of introducing heavy library dependencies so we’ll have to see.

inakleinbottle · December 28, 2021, 6:18pm

Time for yet another update, very close to the end now!

So I’ve solved the issues I was having with tags and I’ve finished implementing a filtering button for the main window to show only those images in the image list that are tagged with selected tags. The first click of the filter button brings up a new dialog for the user to select tags. Then on completion the filter will be applied and the button will change colour and function to remove the current filter. I thought this was a fairly simple solution to an otherwise tricky problem of managing active filters. In terms of the underlying model, this is just constructing a SQL statement that becomes the WHERE clause of the model’s select statement. Undoing the filter is simply setting this clause to an empty string. Thankfully that part was quite easy to figure out.
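The idea of a filter that is just a WHERE clause, with the empty string meaning "no filter", can be sketched as follows. This is plain Python with `sqlite3` rather than a Qt model, the helper names are hypothetical, and it assumes "tagged with selected tags" means "tagged with any of them"; matching all selected tags would need a different subquery.

```python
import sqlite3


def build_tag_filter(tag_ids):
    """Return the body of a WHERE clause; an empty string means no filter."""
    ids = [int(t) for t in tag_ids]  # int() keeps arbitrary text out of the SQL
    if not ids:
        return ""
    id_list = ", ".join(str(i) for i in ids)
    return f"id IN (SELECT imageID FROM image_tags WHERE tagID IN ({id_list}))"


def select_titles(conn, where=""):
    """Run the images query, adding the WHERE clause only when one is set."""
    sql = "SELECT title FROM images"
    if where:
        sql += " WHERE " + where
    sql += " ORDER BY id"
    return [title for (title,) in conn.execute(sql)]


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE images (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE image_tags (id INTEGER PRIMARY KEY, imageID INTEGER, tagID INTEGER);
INSERT INTO images (id, title) VALUES (1, 'beach'), (2, 'garden'), (3, 'untagged');
INSERT INTO image_tags (imageID, tagID) VALUES (1, 1), (2, 2);
""")

filtered = select_titles(conn, build_tag_filter([1]))    # filter applied
unfiltered = select_titles(conn, build_tag_filter([]))   # filter cleared
```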

I’ve left open the possibility of extending the filtering system somewhat but, because of time, I haven’t put this in the current version. Basically the mechanisms are in place to allow specifying multiple filtering statements like title contains or caption contains etc. I was careful to build in the machinery that would allow for the construction of more complex WHERE clauses, but constructing other components of this will be rather tricky. I will come back to this at some future point, but at least I have something that works now.

The second thing I have worked on is the export system. I originally thought that I would have to bring in some external PDF document library to manage this, but it turns out that Qt has a handy PDF writer class that works like a print device that one can use to print to PDF. I combined this with the Qt Document class and, after a lot of fiddling around, managed to make a working export system. Now, with the desired image selected in the main list, one can use File > Export to generate a PDF file that contains the title, image, and the associated caption. This is a little sparse, and ideally I would have liked to be able to export multiple images into a single PDF but that would require setting up new dialogs and the background logic to handle pagination etc. for the final document. Given how long it took me to figure out how to correctly resize the images so they fit on the page, I don’t think I will have time for this.
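The resizing step boils down to a small piece of arithmetic: scale both dimensions by the same factor, chosen so that neither exceeds the printable area. The sketch below shows that calculation in plain Python; it is not the Qt code from the project, the function name is hypothetical, and the choice never to enlarge small images is an assumption.

```python
def fit_within(image_w, image_h, page_w, page_h):
    """Scale an image to fit the page area while keeping its aspect ratio."""
    # The smaller ratio is the limiting dimension; the 1.0 cap means small
    # images are left at their original size (an assumption of this sketch).
    scale = min(page_w / image_w, page_h / image_h, 1.0)
    return round(image_w * scale), round(image_h * scale)


# A 4000x3000 photo on an 800x1000 printable area is limited by its width.
landscape = fit_within(4000, 3000, 800, 1000)
# A tall 1000x4000 image on the same area is limited by its height.
portrait = fit_within(1000, 4000, 800, 1000)
```

Here `landscape` comes out as 800x600 and `portrait` as 250x1000, so both keep their original proportions.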

As @Hammerhead_Corvette and I discussed above, I’ve added a hash value to the database to ensure the uniqueness of an image in the database. This is nothing particularly special: when a new image is added to the database, the file contents are hashed using SHA256 (which is probably overkill, since it doesn’t really need to be that cryptographically strong), and the digest is stored in the database. This has a uniqueness constraint so each image can exist only once in the database. Fortunately this didn’t need any additional libraries since Qt has a cryptographic hash module.
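A minimal sketch of this scheme, using Python's `hashlib` and `sqlite3` in place of Qt's hashing class: the file is hashed in chunks, and a `UNIQUE` column makes the database reject a second copy. The column name `sha256` and the function names are assumptions for illustration.

```python
import hashlib
import os
import sqlite3
import tempfile


def file_digest(path, chunk_size=65536):
    """SHA-256 of a file's contents, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def add_image(conn, path):
    """Insert the image; return False if its contents are already stored."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO images (path, sha256) VALUES (?, ?)",
                (path, file_digest(path)),
            )
        return True
    except sqlite3.IntegrityError:  # raised by the UNIQUE constraint
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images (id INTEGER PRIMARY KEY, path TEXT, sha256 TEXT UNIQUE)"
)

with tempfile.TemporaryDirectory() as folder:
    original = os.path.join(folder, "photo.jpg")
    copy = os.path.join(folder, "photo-copy.jpg")
    for name in (original, copy):
        with open(name, "wb") as handle:
            handle.write(b"same bytes, different file name")
    results = [add_image(conn, original), add_image(conn, copy)]
```

Because the hash covers the file contents only, a renamed or moved copy is still recognised as the same image.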

I still have a couple of things left to do. Most of these are portability related, such as selecting the proper place to put the SQLite database in which all the data is stored. Currently this just sits in my build directory, but this obviously isn’t sensible for an end-user. I also need to manage some configuration options to tune the appearance, and make sure that there are backup icons in case this does not translate well over to Windows. Finally, I need to put together some kind of installer/packaging system that will allow my mother to install the program on her PC. I have no doubt that the latter will prove quite frustrating.
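One common convention for the database location is a per-user application data directory. The sketch below picks one per platform in plain Python; the directory choices are typical conventions rather than anything the project has settled on, and the application and file names are hypothetical. In a Qt application, the `QStandardPaths` class is the usual way to ask for such a location.

```python
import os
import sys
from pathlib import Path


def default_database_path(app_name="caption_manager"):
    """Pick a per-user data directory by platform and return the DB path."""
    if sys.platform == "win32":
        # %APPDATA% normally points at C:\Users\<name>\AppData\Roaming
        base = Path(os.environ.get("APPDATA") or Path.home() / "AppData" / "Roaming")
    elif sys.platform == "darwin":
        base = Path.home() / "Library" / "Application Support"
    else:
        # XDG base directory convention used on most Linux desktops
        base = Path(os.environ.get("XDG_DATA_HOME") or Path.home() / ".local" / "share")
    return base / app_name / "captions.sqlite"


db_path = default_database_path()
print(db_path)
```

The function only computes the path; the application would still need to create the parent directory (for example with `db_path.parent.mkdir(parents=True, exist_ok=True)`) before opening the database.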

The up-to-date code for this post is on GitHub with the following reference: GitHub - inakleinbottle/caption_manager at bd2fe1d33a03d1892229675e54b9ff90ac5ac324.

Thanks all for following along with this project.

inakleinbottle · December 31, 2021, 7:11pm

Happy New Year’s Eve, everybody!

Final update time.

I spent an unreasonable amount of time last night compiling the program for Windows using an Azure virtual machine and no small amount of swearing as I waited for vcpkg to install all of the many many libraries required by Qt. Fortunately this only took a couple of hours. Finally I had a collection of binaries.

I tried to package this up into a nice installer that I could host somewhere, but sadly I could not get the Qt Installer Framework to work (or even figure out how to download it). The documentation was less than helpful. Instead I took all the binaries from the build directory and zipped them up. I’ve attached this artifact to the GitHub repository under the Release section. My mother tested it this afternoon and it mostly works as intended, apart from a small amount of faffing about to put all the binaries in the right place. Please feel free to download and try it for yourself. (On Linux you can probably compile it for yourself, but if you don’t know how or can’t for some other reason, please let me know and I’ll see what I can do.)

Overall I think this has mostly been a success. There are some parts which aren’t as good as I’d hoped or planned, but I did manage to implement everything that I planned in one form or another. The only real let-down was the installer.

Happy coding, everyone.