
EvoDevo Papers, refactored

EvoDevo Papers is a bot that shares research articles in the field of evolutionary developmental biology (or evo-devo). Originally, I created it as a Twitter bot and everything was going well for years… until Twitter’s takeover. The new policies and API restrictions made it unsustainable to continue posting there. In response to that, I decided to transition EvoDevo Papers to Mastodon.

In December 2022, I set up a pipeline using the feed2toot library to consume RSS feeds from academic journals and post the articles on Mastodon. Although I wasn’t very satisfied with it, it worked fine for about a year. However, after my server’s Python version was updated, feed2toot stopped working, which broke EvoDevo Papers. I found no simple workaround and no published fix for the issue.

Because of this, I took the opportunity to completely restructure the pipeline and lay the foundation for the future of EvoDevo Papers.

You can check out the results in the links below, or continue reading to learn more details about the pipeline.

Django-based application

To develop the application, I opted for a framework I’m familiar with: Django. Since I used it to develop the Cifonauta database, it felt natural to use it again.

My goal was to create a minimal working EvoDevo Papers app that can post the most recent papers from relevant journals. Here’s the workflow I followed:

  1. Retrieve the journal’s RSS feed.
  2. Collect new articles from the feed.
  3. Generate the text for each post.
  4. Publish the posts to Mastodon.

Consuming feeds with django-feed-reader

I was confident that there was a Django package available for reading feeds, and I was right. The first one I came across was django-feed-reader. It’s funny how sometimes I can immediately tell whether a library or package will be a good fit for me. In the case of feed2toot, I knew it wasn’t. Something about the organization of the code, how options are handled, and other small details didn’t sit well with me. With django-feed-reader, however, it was a great match!

The package provides all the basic functionality I need for reading feeds, as well as some useful additional features. It’s straightforward to add new feeds, check for feed updates, and import new entries. It also keeps track of each feed’s activity and adjusts the update frequency accordingly. Additionally, there’s a management command available to refresh the feeds, which is incredibly helpful for my specific use case, as I’ll explain below.
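To make "import new entries" concrete, here is a minimal standard-library sketch of the idea: parse an RSS document and keep only the items that have not been seen before. This is not how django-feed-reader is implemented, and the app does not use this code; the sample feed, the function name `collect_new_entries`, and the `seen_guids` set are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, not taken from any real journal.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Journal</title>
    <item>
      <title>First paper</title>
      <link>https://example.org/paper-1</link>
      <guid>paper-1</guid>
    </item>
    <item>
      <title>Second paper</title>
      <link>https://example.org/paper-2</link>
      <guid>paper-2</guid>
    </item>
  </channel>
</rss>"""


def collect_new_entries(rss_text, seen_guids):
    """Return items whose guid is not in seen_guids, and mark them as seen."""
    root = ET.fromstring(rss_text)
    new_entries = []
    for item in root.iter("item"):
        link = item.findtext("link", default="").strip()
        # Fall back to the link when a feed omits <guid>.
        guid = (item.findtext("guid") or link).strip()
        if not guid or guid in seen_guids:
            continue
        seen_guids.add(guid)
        new_entries.append({
            "guid": guid,
            "title": item.findtext("title", default="").strip(),
            "link": link,
        })
    return new_entries
```

In the real pipeline the "seen" state lives in the database rather than in a set, which is what lets a once-a-day refresh pick up only the new articles.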

Creating posts using a custom package

With the Sources (journals’ feeds) and Articles (feeds’ entries) being handled by django-feed-reader, I only needed to use this information to create a Mastodon post.

What I find useful is simply the article’s title and link, followed by a couple of hashtags:

Developmental and genomic insight into the origin of the tardigrade body plan https://doi.org/10.1111/ede.12457 #EvoDevo #Papers

Example from here.

To process this information, I wrote a custom Papers app that takes the details from new articles and creates the Mastodon posts. For automation, this step is conveniently controlled by a custom Django management command named createstatuses.py.
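The text-building step could look roughly like the sketch below. The function name `build_status`, its parameters, and the 500-character limit are assumptions for illustration (500 is the usual Mastodon default, but instances can change it); the actual Papers app may do this differently.

```python
MAX_CHARS = 500  # assumption: common Mastodon default; instances may differ


def build_status(title, link, hashtags=("EvoDevo", "Papers"), max_chars=MAX_CHARS):
    """Compose 'title link #tag1 #tag2', shortening the title if needed."""
    tags = " ".join(f"#{tag}" for tag in hashtags)
    tail = f" {link} {tags}" if tags else f" {link}"
    # Feed titles sometimes contain newlines or repeated spaces.
    title = " ".join(title.split())
    room = max_chars - len(tail)
    if len(title) > room:
        # Leave one character for the ellipsis.
        title = title[: max(room - 1, 0)].rstrip() + "…"
    return title + tail
```

Only the title is ever shortened, so the link and hashtags always survive intact.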

Posting to Mastodon using Mastodon.py

To handle the communication with Mastodon, I used what seems to be the only full-featured Python library for the Mastodon API, Mastodon.py.

I created a simple Clients app with a model for storing the client credentials and access tokens for the current Mastodon account of EvoDevo Papers. This will allow me to add and manage new clients for other social networks in the future if needed.

With all of these set, EvoDevo Papers needs to initiate the conversation with Mastodon’s API to publish the post. There’s a management command named publishstatus.py that takes the oldest unpublished article in the database, and publishes it to the @evodevo_papers@botsin.space account.
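The "oldest unpublished first" logic can be sketched independently of Django. The `Status` dataclass and `publish_oldest` function are hypothetical stand-ins for the app's models and command; the posting function is passed in, so the sketch runs without a network connection.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Iterable, Optional


@dataclass
class Status:
    """Hypothetical stand-in for a stored post."""
    text: str
    created: datetime
    published: bool = False
    response: Optional[dict] = None


def publish_oldest(statuses: Iterable[Status],
                   post: Callable[[str], dict]) -> Optional[Status]:
    """Publish the oldest unpublished status; return it, or None if none is pending."""
    pending = [s for s in statuses if not s.published]
    if not pending:
        return None
    oldest = min(pending, key=lambda s: s.created)
    oldest.response = post(oldest.text)  # keep the API response for later use
    oldest.published = True
    return oldest


# With Mastodon.py, `post` could be the client's status_post method:
#
#   from mastodon import Mastodon
#   client = Mastodon(access_token="...", api_base_url="https://botsin.space")
#   publish_oldest(statuses, client.status_post)
```

Publishing a single item per run is what spreads a large batch of new articles over several days.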

The cool thing is that after publishing a post, I get and store Mastodon’s JSON response with the boosts, likes, and replies. This allows for using this information on the website and maybe even making the bot interact with the Fediverse.
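As a small illustration, the counts can be pulled out of a stored response like this. The field names `reblogs_count`, `favourites_count`, and `replies_count` come from Mastodon's Status entity; the helper name `engagement_counts` is hypothetical. One caveat: the response returned right after posting will normally show zeros, so useful numbers would require fetching the status again later.

```python
def engagement_counts(status: dict) -> dict:
    """Map a Mastodon status response to boosts, likes, and replies."""
    return {
        "boosts": status.get("reblogs_count", 0),
        "likes": status.get("favourites_count", 0),
        "replies": status.get("replies_count", 0),
    }
```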

Building a static website with django-distill

Yes, another thing I wanted was for EvoDevo Papers to have its own website. But due to a limitation with my server, I can’t host a fully fledged Django app on it. I can, however, run Django on the backend.

Therefore, I decided to try running the Django application not only for updating the feeds and creating/publishing posts, but also for generating a static website from that content, which I could then serve as plain HTML pages.

I searched for Django packages for building static websites… and found one, django-distill. It looked practical, and it was. Setting up and compiling simply worked. Great!

Designing the frontend with Simple.css

I’m a fan of minimal CSS frameworks, such as Skeleton. But I researched a bit more to see what else was out there. I was looking for something simple following HTML standards that would work out of the box. I found Simple.css.

For the home page, I put a list of the latest articles and of the feed sources. Some descriptions. Not much else. It looks fine for now!

Defining posting frequency with cron jobs

With all the components set, the app needs to run periodically to:

  1. Check feeds for updates
  2. Generate and publish posts
  3. Build static website

For that, I set up a cron job that runs all the management commands once a day. As there aren’t many EvoDevo articles being published every day, the posts will be spread across multiple days. It’s also good practice to avoid posting 20 articles at once, something I had no control over when I was using feed2toot. But I’ll fine-tune this as it goes.
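One way to chain the steps is a small wrapper script that cron calls once a day. This is a sketch under assumptions, not the actual setup: createstatuses and publishstatus are the commands named above, while refreshfeeds (django-feed-reader), distill-local with its --force flag (django-distill), the output directory, and all paths are my best understanding or placeholders and should be checked against each package's documentation.

```python
import subprocess
import sys

# Order matters: refresh feeds, create posts, publish one, rebuild the site.
PIPELINE = [
    ["refreshfeeds"],                         # assumed django-feed-reader command
    ["createstatuses"],                       # named in this post
    ["publishstatus"],                        # named in this post
    ["distill-local", "--force", "public/"],  # assumed django-distill command and flag
]


def run_pipeline(commands=PIPELINE, runner=subprocess.run, manage_py="manage.py"):
    """Run each management command in order and stop at the first failure.

    Returns the names of the commands that completed successfully.
    """
    completed = []
    for args in commands:
        result = runner([sys.executable, manage_py, *args])
        if result.returncode != 0:
            break
        completed.append(args[0])
    return completed


# Hypothetical crontab entry running a script that calls run_pipeline() daily at 09:00:
#   0 9 * * * /path/to/venv/bin/python /path/to/run_pipeline.py
```

Stopping at the first failure means a broken feed refresh never leads to a half-built website.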

Exploring ideas for the future

This short burst of full-stack development got me thinking about other cool things I could do with this type of scholarly data.

The bot could, for example, extract keywords from the articles and use them as hashtags. It could discover and post related articles from time to time, or interact with people who like or boost its posts on Mastodon.

Going a bit further, I could fetch articles directly from OpenAlex or CrossRef. In this way, I could get even more data and metadata and extract the current hot topics, authors, or organisms, similar to what I already did with Living Bibliography.

This would be super informative to reveal the broader landscape of the EvoDevo field of research.
