Download Connect 2014 presentation files
The show is over and the annual question arises: how do I download all the presentations? To do that, you will need a valid username and password for the Connect 2014 site; there is no anonymous access here. The 2014 site is built on IBM Portal and IBM Connections. IBM Connections has an ATOM REST API, which opens interesting possibilities. With a few steps you can get your hands on all files. I will use CURL to do this.
- Create or edit your .netrc file to add your Connect 2014 credentials (in one line)
machine connections.connect2014.com login [YourNumericID] password [YourNumericPassword]
(Note: [ and ] are NOT part of the line in the .netrc file)
- Download the feed. Checking this morning, I found a little more than 500 files. The Connections API allows for max 500 entries per "page", so 2 calls will be sufficient for now. You can check the number of files in the <snx:rank> element in the resulting XML:
curl --netrc -G --basic -L 'https://connections.connect2014.com/files/basic/anonymous/api/documents/feed?sK=created&sO=dsc&visibility=public&page=1&ps=500' > page1.xml
curl --netrc -G --basic -L 'https://connections.connect2014.com/files/basic/anonymous/api/documents/feed?sK=created&sO=dsc&visibility=public&page=2&ps=500' > page2.xml
(explanation of parameters below)
- Transform the resulting files to a shell script using XSLT (see below)
java -cp saxon9he.jar net.sf.saxon.Transform -t -s:page1.xml -xsl:connect2014.xslt -o:page1.sh
- Make the scripts executable (unless your OS would execute arbitrary files)
chmod +x page1.sh
- Run the download
./page1.sh
- the CURL parameters
- --netrc: pull the user name and password from the .netrc file
- -G: perform a GET operation
- --basic: use basic authentication
- -L: follow redirects (probably not needed here)
- (optional) -v: verbose output
- the Connections Files API parameters
- sK=created: sort by creation date
- sO=dsc: sort descending
- visibility=public: show all public files
- page=1|2: which page to show. Where a page starts depends on the page size
- ps=500: show 500 files per page (that's the maximum Connections supports)
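The two feed calls above can be generalized once you know the total from the <snx:rank> element. The following is a minimal sketch, not part of curl or the Connections API: the function names page_count, feed_url and print_feed_urls are made up for illustration, and the script only prints the feed URLs so you can inspect them before handing them to curl.

```shell
#!/bin/bash
# Hypothetical helpers for illustration; not part of curl or the Connections API.

BASE='https://connections.connect2014.com/files/basic/anonymous/api/documents/feed'

# page_count TOTAL PAGE_SIZE
# Prints the number of pages needed to cover TOTAL entries (ceiling division).
page_count() {
  local total="$1" size="$2"
  echo $(( (total + size - 1) / size ))
}

# feed_url PAGE PAGE_SIZE
# Prints the feed URL with the same sort and visibility parameters used above.
feed_url() {
  local page="$1" size="$2"
  echo "${BASE}?sK=created&sO=dsc&visibility=public&page=${page}&ps=${size}"
}

# print_feed_urls TOTAL PAGE_SIZE
# Prints one feed URL per line, for page 1 up to the last page.
print_feed_urls() {
  local total="$1" size="$2" pages page
  pages=$(page_count "$total" "$size")
  page=1
  while [ "$page" -le "$pages" ]; do
    feed_url "$page" "$size"
    page=$((page + 1))
  done
}
```

For example, with an illustrative total of 520 entries, print_feed_urls 520 500 prints the page=1 and page=2 URLs. Each of them can then be passed to curl --netrc -G --basic -L as in the download step.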
The XSLT used:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:snx="http://www.ibm.com/xmlns/prod/sn"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="no" method="text" />
<xsl:template match="/">#!/bin/bash
# Entries in this feed <xsl:value-of select="atom:feed/snx:rank" />
echo "Starting downloads"
<xsl:apply-templates select="atom:feed/atom:entry/atom:link[@rel='enclosure']" />
</xsl:template>
<xsl:template match="atom:link">curl --netrc -G --basic -C - -L "<xsl:value-of select="@href"></xsl:value-of>" -o "<xsl:value-of select="@title" />"
</xsl:template>
</xsl:stylesheet>
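Before running a generated script, it can help to check it first. File titles end up inside double quotes in the script, so an unusual title could break the quoting. The sketch below is only a suggestion under that assumption; check_script and count_downloads are made-up names. It uses bash -n, which parses a script without executing it, and then counts the lines that start with curl.

```shell
#!/bin/bash
# Hypothetical helpers for inspecting a generated script such as page1.sh.

# count_downloads SCRIPT
# Prints how many lines in SCRIPT begin with "curl ".
count_downloads() {
  grep -c '^curl ' "$1"
}

# check_script SCRIPT
# Syntax-checks SCRIPT without running it, then prints the download count.
check_script() {
  bash -n "$1" || return 1
  count_downloads "$1"
}
```

Running check_script page1.sh should print a number that matches the entries you expect on that page of the feed. If bash -n reports an error, look for a title containing a double quote or similar character.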
Of course, you could simply scout Slideshare.
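Update: Stevan Bajić tuned the XSLT to check whether a file with the right size already exists before downloading it. This makes it a Linux native script. Note that it writes to ./files/, so create that directory before you run the generated script. Enjoy! The following script is © 2014 Stevan Bajić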
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:snx="http://www.ibm.com/xmlns/prod/sn" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="no" method="text" />
<xsl:template match="/">#!/bin/bash
# Entries in this feed <xsl:value-of select="atom:feed/snx:rank" />
echo "Starting downloads"
<xsl:apply-templates select="atom:feed/atom:entry/atom:link[@rel='enclosure']" />
</xsl:template>
<xsl:template match="atom:link">[ "$(stat "./files/<xsl:value-of select="@title" />" 2>&amp;1 | sed -n "s:^[\t ]*Size\:[\t ]*\([0-9]*\)[\t ].*:\1:gp")" != "<xsl:value-of select="@length" />" ] &amp;&amp; curl --netrc -G --basic -L "<xsl:value-of select="@href"></xsl:value-of>" -o "./files/<xsl:value-of select="@title" />"
</xsl:template>
</xsl:stylesheet>
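The size check in the tuned stylesheet parses the human-readable output of stat with sed. For comparison, here is a minimal sketch of the same idea as a standalone shell function. It swaps in wc -c to read the byte count, which is a different technique from the stat/sed pipeline above, and needs_download is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper; a simplified stand-in for the stat/sed check above.

# needs_download FILE EXPECTED_SIZE
# Succeeds (exit 0) when FILE is missing or its size in bytes differs
# from EXPECTED_SIZE; fails (exit 1) when the file is already complete.
needs_download() {
  local file="$1" expected="$2" actual
  [ -f "$file" ] || return 0
  actual=$(wc -c < "$file" | tr -d '[:space:]')
  [ "$actual" != "$expected" ]
}
```

In a generated script this could guard each download, for instance needs_download "./files/[title]" [length] followed by && and the curl command, where [title] and [length] stand for the values the stylesheet takes from the @title and @length attributes.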
Posted by Stephan H Wissel on 04 February 2014 | categories: IBM