When you use someone else’s data in your research, it’s essential to understand the licensing terms and reuse restrictions. Data, even if freely available online, is not automatically free of restrictions. Misuse can lead to copyright violations, ethical concerns, or even retraction of published work.
Here are some things you should do:
- Check the license or terms of use. Most datasets include a license or a terms-of-use statement specifying what you’re allowed to do.
- Consider ethical and legal obligations, such as privacy and confidentiality, Institutional Review Board (IRB) requirements, and data export controls and embargoes.
- Properly attribute the source. Even for open data, cite the data source just as you would any other scholarly resource. Many repositories and datasets provide a preferred citation format.
- Be cautious with scraped or aggregated data. If you’re scraping data from websites or aggregating data from multiple sources, check each site’s robots.txt file and terms of service. Also, be aware that scraping can still violate copyright or contract terms, even if the data is publicly visible.
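
If you are comfortable with a little code, the robots.txt check mentioned above can be sketched in Python using the standard library's `urllib.robotparser` module. This is only a minimal illustration: the helper name `is_allowed`, the `research-bot` user agent, the sample robots.txt rules, and the `example.org` URLs are all made up for the example. A passing check also does not mean scraping is permitted, since the site's terms of service and copyright still apply.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url.

    Hypothetical helper for illustration; it only interprets robots.txt rules
    and says nothing about a site's terms of service or copyright.
    """
    parser = RobotFileParser()
    # parse() takes the robots.txt content as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up robots.txt content standing in for a real site's file.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for url in (
        "https://example.org/data/file.csv",
        "https://example.org/private/file.csv",
    ):
        print(url, is_allowed(SAMPLE_ROBOTS, "research-bot", url))
```

To check a live site instead of sample text, `RobotFileParser` also provides `set_url()` and `read()`, which download the site's robots.txt file before you call `can_fetch()`.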
Check out the following references for more information.
Contact the Digital Scholarship and Publishing team to ask a question, set up a consultation, or learn more about the library's research data support services.