This site has a linkblog and I thought I'd do a quick writeup on how I capture the links and their metadata.  You'll notice the the links are displayed in cards, similar to what you see on social media sites such as Facebook and Twitter.

This writeup will cover:

  1. The conten type to store a link and its metadata,
  2. Creating a bookmarklet so you can easily add a story to your site as you surf the web.
  3. Scraping metadata on a webpage to get the image, site name, title and description.

This writeup assumes that you have some basic understanding of Drupal on a site builder level.  I'm assuming you understand basic administration tasks such as creating content types, and fields as well as how to create a module,.  

Although I wrote the code for Drupal 9, as I review it, I see no reason that it won't work for Drupal 8.   Since support is ending for Drupal 8, you should be upgrading to Drupal 9, but that's a different matter.

Creating The Content Type

You'll need a content type to house the links.  On this site I'm using my generic note type which I use for most of my blog posts (this allows me to add a link to any post).  But I assume you want to use a separate content type, let's create a content type named link.  In addition to the standard title and body fields you want to give it the following fields::

Label

Field Name

Field Type

Link

field_link

Link

OG Link Description

field_og_link_description

Text (plain, long)

OG Link Image Url

field_og_link_image_url

Text (plain, long)

OG Link Site Name

field_og_link_site_name

Text (plain)

OG Link Title

field_og_link_title

Text (plain)

OG Link URL

field_og_link_url

Text (plain, long

HTTP Status Code

field_http_status_code

Text (plain, long)

 

Of course you can just add these fields to an existing content type as I did, you'll just need to adjust the code as you go forward.

Building the Bookmarklet

Simply put, a bookmarklet is a browser bookmark that contains JavaScript.  We're going to create one that will open a node add form with the url of the current page already prepopulated into the link field.  This saves you the effort of copying the current URL, opening your site, navigating to the node add form and pasting in the url of the page you want to blog.  There are 2 things we need ti do to make this work:

First we'll  get Drupal to accept a parameter on the node add form's URL and prepopulate the link field.  We either need to create a new module or use a module that you already use for glue code and use hook_form_alter.


This code basically says, "when loading the node add page for the link content type, look to see if there is a 'link' query string, and if there is, put the contents of the query string into field_link."

Next we need to get the query string into the URL .... that's where the bookmarket comes into play.  Here's a little javascript


You need to replace "example.com" with your site's URL.  Just add a bookmark in your browser, call it something like "Add Link To My Site" and paste the javascript in as the link.  Add the bookmark to the favorites bar and when you're on a page that you want to blog about, click on the button, add any commentary in the body field and rock and roll.

There is a contributed module, Prepopulate which accomplishes the sane thing (and more) but is a little more overhead than the couple of lines of code I wrote here.  Plus, if we use contrib for the easy things, we'll never learn anything.

Fetching Metadata

Next we need to fetch the image url, site name, title and description.   You can either scrape the content for metadata server side at save or client side at when rendering the page.  I prefer doing it at save since doing it  doing it at client side will slow down page loads.  Of course, since you're caching the information, if the site changes any of the metadata, your site will be out of date.

Instead of writing code to parse out the metadata, I took advantage of opengraph.php, a library that does the heavy lifting,  Very simply, I used hook_ENTITY_TYPE_presave to populate the appropriate fields.  You can put this in the same module from above:


This loads the open graph library, loads the page info a variable and pass the page to the library to find the  metadata and then add it to the node before it's saved.