Answered By: Carl Last Updated: Jul 22, 2025 Views: 0
Yes. We can walk you through key criteria and frameworks to assess datasets based on scholarly research and best practices. Here are the main dimensions and their elements to consider:
- Availability
  - Accessibility - How easy is it to access the dataset? Is it open to the public, or do you need special permissions, credentials, or payment to obtain it?
  - Timeliness - Was the data collected and made available within a time frame that keeps it relevant for your research? Is it updated regularly enough for your needs?
- Usability
  - Credibility - Is the dataset produced by a trustworthy and reputable source? Are the methods and context of its production clearly documented, and does it seem reliable given when and under what conditions it was created? Have the data been normalized (that is, adjusted or standardized to remove inconsistencies, unit mismatches, or irregular formats) so that they are coherent and comparable?
  - Definition/Documentation - Does the dataset include clear definitions of its variables, valid value ranges, formats, and any rules or standards applied to it?
  - Metadata - Does the dataset come with metadata (e.g., a README file, a data dictionary/codebook, or a qualitative codebook) that describes its content, structure, origin, and meaning, so that it is not misunderstood or misinterpreted?
- Reliability
  - Accuracy - Does the data accurately reflect the real-world phenomena it is supposed to represent? Is there a known reference or benchmark against which to check its correctness?
  - Consistency - Are similar data points stored or represented in the same way throughout the dataset? If the same information appears in multiple places, do the copies agree with each other?
  - Integrity - Does the dataset have a complete and logical structure? Are its internal relationships, rules, and definitions maintained properly, without unauthorized changes?
  - Completeness - Are all expected components and variables of the data present and valid? Are there missing parts that would impair its use or accuracy?
  - Auditability - Could an auditor or reviewer independently verify the data’s accuracy and integrity without excessive time or effort?
- Relevance
  - Fitness - Does the dataset adequately cover the aspects of your research topic? Are its elements, indicators, and classifications aligned with your research questions and needs?
- Presentation Quality
  - Readability - Is the dataset presented in a way that is easy to interpret, using standard terms, codes, units, and conventions? Are its descriptions clear and understandable?
  - Structure - Is the dataset well organized and structured in a way that makes it easy to analyze? If it contains semi-structured or unstructured data, how difficult would it be to transform it into a structured format?
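Some of these questions can be partly checked with a few lines of code once you have the data in hand. The following is only a minimal Python sketch, assuming a small table loaded as a list of dictionaries; the sample rows, the column names (`id`, `country`, `value`, `collected`), the set of "missing" markers, and the helper function names are all hypothetical and would need to be adapted to your dataset. It illustrates simple checks for completeness, consistency, and timeliness, and does not replace the judgment the other criteria require.

```python
from datetime import date

# Hypothetical sample records; column names and values are made up for illustration.
rows = [
    {"id": 1, "country": "US", "value": "10.5", "collected": "2024-01-15"},
    {"id": 2, "country": "usa", "value": "", "collected": "2024-02-01"},
    {"id": 2, "country": "US", "value": "7.0", "collected": "2024-02-01"},
    {"id": 3, "country": "CA", "value": "3.2", "collected": None},
]

# Assumed markers for a missing value; real datasets document their own.
MISSING = {None, "", "NA", "N/A"}


def completeness(rows, column):
    """Completeness: share of rows with a non-missing value in `column`."""
    if not rows:
        return 0.0
    present = sum(1 for r in rows if r.get(column) not in MISSING)
    return present / len(rows)


def conflicting_ids(rows, key, column):
    """Consistency: keys that occur more than once with differing values in `column`."""
    seen = {}
    for r in rows:
        seen.setdefault(r[key], set()).add(r[column])
    return sorted(k for k, values in seen.items() if len(values) > 1)


def days_since_latest(rows, column, today):
    """Timeliness: days between `today` and the most recent ISO date in `column`."""
    dates = [
        date.fromisoformat(r[column])
        for r in rows
        if r.get(column) not in MISSING
    ]
    return (today - max(dates)).days if dates else None


print(completeness(rows, "value"))                               # 0.75
print(conflicting_ids(rows, "id", "country"))                    # [2]
print(days_since_latest(rows, "collected", date(2024, 3, 1)))    # 29
```

In this made-up sample, one of four `value` entries is missing, the record with `id` 2 is coded as both "usa" and "US", and the most recent collection date is 29 days before the reference date. Results like these are prompts for a closer look rather than verdicts on the dataset's quality.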
Contact the Digital Scholarship and Publishing team to ask a question, set up a consultation, or learn more about the library's research data support services.
References:
- Cai, L., & Zhu, Y. (2015). The challenges of data quality and data quality assessment in the big data era. Data Science Journal, 14, 2. https://doi.org/10.5334/dsj-2015-002