Included is most of the Python code I used to obtain blog content, some of my attempts to automate building the network (I ended up using a manual process), and my analysis. I also included the data. (You can probably see some of your own content.)
Here's what I learned, or was reminded of, the most:
- Doing projects like this is hard when you have other responsibilities, and you usually end up paring down your ambitions toward the end
- Data collection and curation were, as usual, the most difficult part
- Network analysis is fun, but I have a ways to go before I know where to start, what questions to ask, and so forth (these are the things you learn with experience)
- The measures that seem to be the most revealing are not always obvious -- in this network, it was the number of shortest paths compared to a random graph
- Andrew Gelman's blog is central (but you probably don't need a formal analysis to tell you that)
- There's a lot of great content about statistics, data analysis, data science, and statistical computing out there. I've relied on blog posts for a lot of my work, and I've found even more great stuff. It's a firehose of information.
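
To make the shortest-path comparison mentioned above a bit more concrete, here is a minimal sketch of one way such a comparison could be set up. It is not the code from the analysis itself. It assumes the network is a directed graph stored as a dictionary of adjacency sets, that "number of shortest paths" means the count of distinct shortest paths summed over all ordered pairs of nodes, and that the random baseline is a graph with the same nodes and the same number of edges placed uniformly at random. The function names and the four-node toy network are made up for illustration.

```python
import random
from collections import deque


def count_shortest_paths(adj, source):
    """BFS from source; map each reachable node to its number of distinct shortest paths."""
    dist = {source: 0}
    sigma = {source: 1}  # sigma[v] = number of shortest paths from source to v
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:  # first time v is reached
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:  # u lies on a shortest path to v
                sigma[v] += sigma[u]
    return sigma


def total_shortest_paths(adj):
    """Sum the shortest-path counts over all ordered pairs (source != target)."""
    nodes = set(adj) | {v for nbrs in adj.values() for v in nbrs}
    total = 0
    for s in nodes:
        sigma = count_shortest_paths(adj, s)
        total += sum(c for t, c in sigma.items() if t != s)
    return total


def random_digraph(nodes, n_edges, rng):
    """Random directed graph on the same nodes with n_edges distinct edges, no self-loops."""
    nodes = list(nodes)
    pairs = [(u, v) for u in nodes for v in nodes if u != v]
    adj = {u: set() for u in nodes}
    for u, v in rng.sample(pairs, n_edges):
        adj[u].add(v)
    return adj


# Hypothetical toy network (not the real blog data): an edge means "links to".
toy = {"a": {"b", "c"}, "b": {"d"}, "c": {"d"}, "d": set()}
n_edges = sum(len(nbrs) for nbrs in toy.values())

observed = total_shortest_paths(toy)

# Average the same measure over many random graphs of the same size.
rng = random.Random(0)
baseline = [total_shortest_paths(random_digraph(toy, n_edges, rng)) for _ in range(100)]

print("observed:", observed)
print("random baseline (mean):", sum(baseline) / len(baseline))
```

On a real network you would swap the toy dictionary for the actual link data, and probably use a graph library rather than hand-rolled BFS. The idea stays the same: compute the measure on the observed network, then see where it falls relative to the same measure on comparable random graphs.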