{"id":175,"date":"2021-05-12T09:53:09","date_gmt":"2021-05-12T09:53:09","guid":{"rendered":"http:\/\/ds-wordpress.haverford.edu\/lacol-dh\/?page_id=175"},"modified":"2021-06-28T16:02:12","modified_gmt":"2021-06-28T16:02:12","slug":"week-3-working-with-data","status":"publish","type":"page","link":"http:\/\/ds-wordpress.haverford.edu\/lacol-dh\/labs\/week-3-working-with-data\/","title":{"rendered":"Week 3 &#8211; Working with Data"},"content":{"rendered":"\n<p>Now that we\u2019ve learned a bit more about how humanities data is created, collected, and structured, we\u2019re going to explore some tools for making sense of large datasets and for \u201ctidying\u201d them up. We\u2019ll try out two tools that are handy for data-driven DH projects: <a href=\"https:\/\/databasic.io\/en\/wtfcsv\/\">WTFcsv<\/a> and <a href=\"https:\/\/openrefine.org\/\">OpenRefine<\/a>. The latter requires a software install. Please see the documentation for installing OpenRefine in <a href=\"https:\/\/docs.google.com\/document\/d\/1UPnZfNd-sU8fg8WgTAoHw3jb3IU4GufPzW9Auwh6AEw\/edit?usp=sharing\" target=\"_blank\" rel=\"noreferrer noopener\">this tutorial<\/a> and message me on Slack, visit me during office hours, or contact one of the course TAs for help if you need it.<\/p>\n\n\n\n<p>In this lab we will:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Gain familiarity with a couple of tools for making sense of and manipulating data<\/li><li>Follow a workshop video produced by Haverford Libraries to learn these tools with sample data<\/li><li>Examine a CSV (comma-separated values text file) export of the Omeka item metadata from last week\u2019s lab and identify one point of interest in the collection metadata<\/li><\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><strong><em>Specs<\/em><\/strong><\/h3>\n\n\n\n<ul class=\"wp-block-list\"><li>Approximately 750 words<\/li><li>Report applies ideas from this week\u2019s readings to the questions<\/li><li>Author offers meaningful research questions for the 
datasets&nbsp;<\/li><li>Author makes their points in clear and concise ways<\/li><li>The work contains no more than 3 grammatical, spelling, or other \u201cmechanical\u201d errors.<\/li><li>The work contains no more than 2 minor factual inaccuracies and no major factual inaccuracies.<\/li><li>Upload PDF to Canvas<\/li><\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Lab Instructions<\/strong><\/h3>\n\n\n\n<ol class=\"wp-block-list\"><li>Download this CSV of song data from the <a rel=\"noreferrer noopener\" href=\"https:\/\/haverford.box.com\/s\/vmfek15mwqs4ysrn18enkipw8ahg8yz1\" target=\"_blank\">Free Music Archive<\/a><\/li><li>Watch and follow <a rel=\"noreferrer noopener\" href=\"https:\/\/youtu.be\/JkztIzvdcL8\" target=\"_blank\">this video<\/a> of a workshop run by Haverford College Libraries in the summer of 2020, referring to the <a rel=\"noreferrer noopener\" href=\"https:\/\/docs.google.com\/document\/d\/1UPnZfNd-sU8fg8WgTAoHw3jb3IU4GufPzW9Auwh6AEw\/edit?usp=sharing\" target=\"_blank\">tutorial<\/a> that includes step-by-step instructions as well as links to sample data for the exercises within. A good place to start is around the 8:30 mark (everything before that is just idle chit-chat and waiting for the workshop to start), and the OpenRefine portion ends at about the 1:03:00 mark (roughly one hour and three minutes in). 
You may continue on past the OpenRefine section, as the video introduces another tool called Palladio that is also useful for DH projects, but it is not required!<\/li><li>Follow the \u201cExercises\u201d section in the <a rel=\"noreferrer noopener\" href=\"https:\/\/docs.google.com\/document\/d\/1UPnZfNd-sU8fg8WgTAoHw3jb3IU4GufPzW9Auwh6AEw\/edit?usp=sharing\" target=\"_blank\">OpenRefine tutorial<\/a> to get more familiar with the tool.<\/li><li>Download <a rel=\"noreferrer noopener\" href=\"https:\/\/haverford.box.com\/s\/54ne271rmbn9lb9xx831ly8ed0b3ytty\" target=\"_blank\">this CSV of item metadata<\/a> from last week\u2019s Omeka exercise.<\/li><li>Create a new project in OpenRefine and import the Omeka export data, and use the tools and strategies you practiced in the exercises to manipulate the data.<\/li><li>Finally, go to <a rel=\"noreferrer noopener\" href=\"https:\/\/databasic.io\/en\/wtfcsv\/\" target=\"_blank\">https:\/\/databasic.io\/en\/wtfcsv\/<\/a> to try out a tool called WTFcsv. In the box that says \u201cuse a sample\u201d or \u201cupload a file,\u201d choose the latter and upload the Omeka collection CSV. What (if any) conclusions can you draw from the result?\u00a0<\/li><\/ol>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Report (about 750 words)<\/strong><\/h3>\n\n\n\n<p>Please address these prompts in your report:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Describe your experience working with OpenRefine. Were you able to install it successfully? Were you able to create a new project and import the data? Were you able to complete all the exercises? What problems did you encounter?<\/li><li>Describe at least one notable observation of the Omeka export data&nbsp;after working with it in WTFcsv or OpenRefine. If nothing is apparent, imagine one data-driven question you might ask of one of the featured collections from Week 2.<\/li><li>What sorts of possibilities do large datasets and tools for manipulating them create? 
What kind of research could you imagine while working with humanities data?<\/li><li>What kinds of research questions or projects could you imagine using the collections you engaged with last week and the data refining tools you worked with this week? Give at least one example of a project idea.<\/li><\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Submission Details<\/strong><\/h3>\n\n\n\n<ul class=\"wp-block-list\"><li>Submit the lab report as a PDF to Canvas by the end of the day (local time) on Saturday, July 3<\/li><li>You can write the report in Google Docs, Word, Pages, or another application. Just be sure to save as a PDF<\/li><\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Now that we\u2019ve learned a bit more about how humanities data is created, collected, and structured, we\u2019re going to explore some tools for making sense of large datasets and for \u201ctidying\u201d them up. We\u2019ll try out two tools that are&hellip; <a href=\"http:\/\/ds-wordpress.haverford.edu\/lacol-dh\/labs\/week-3-working-with-data\/\" class=\"more-link\">Continue Reading <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"parent":167,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"page-template-full.php","meta":{"footnotes":""},"class_list":["post-175","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"http:\/\/ds-wordpress.haverford.edu\/lacol-dh\/wp-json\/wp\/v2\/pages\/175","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/ds-wordpress.haverford.edu\/lacol-dh\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"http:\/\/ds-wordpress.haverford.edu\/lacol-dh\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"http:\/\/ds-wordpress.haverford.edu\/lacol-dh\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/ds-wordpress.haverford.edu\/lacol-dh\/wp-json\/wp\/v2\/comments?post=175"}],"version-history":[{"count":13,"href":"http:\/\/ds-wordpress.haverford.edu\/lacol-dh\/wp-json\/wp\/v2\/pages\/175\/revisions"}],"predecessor-version":[{"id":1161,"href":"http:\/\/ds-wordpress.haverford.edu\/lacol-dh\/wp-json\/wp\/v2\/pages\/175\/revisions\/1161"}],"up":[{"embeddable":true,"href":"http:\/\/ds-wordpress.haverford.edu\/lacol-dh\/wp-json\/wp\/v2\/pages\/167"}],"wp:attachment":[{"href":"http:\/\/ds-wordpress.haverford.edu\/lacol-dh\/wp-json\/wp\/v2\/media?parent=175"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}