Ethics Review Questions

Please note that this is not an exhaustive list of types of ethical issues that can arise, but rather a set of questions meant to help researchers think through possible adverse impacts. Reviewers who spot concerns unrelated to the questions in this list are encouraged to note them in the review form as well.

For papers presenting new datasets:

  • Does the paper describe how intellectual property (copyright, etc.) was respected in the data collection process?
  • Does the paper describe how participants' privacy rights were respected in the data collection process?
  • Does the paper indicate that crowd workers or other annotators were treated fairly? This includes, but is not limited to, compensating them fairly and obtaining their informed consent, which in turn requires that they participated voluntarily and were aware of any risks of harm associated with their participation.
  • Does the paper indicate that the data collection process was subjected to any necessary review by an appropriate review board?

For papers presenting new datasets AND papers presenting experiments on existing datasets:

  • Does the paper describe the characteristics of the dataset in enough detail for a reader to understand which speaker populations the technology could be expected to work for?
  • Do the claims in the paper match the experimental results, in terms of how far the results can be expected to generalize?
  • Does the paper describe the steps taken to evaluate the quality of the dataset?

For papers concerning tasks beyond language-internal matters:

  • Does the paper describe how the technology would be deployed in actual use cases?
  • Does the task carried out by the computer match how it would be deployed?
  • Does the paper address possible harms when the technology is being used as intended and functioning correctly?
  • Does the paper address possible harms when the technology is being used as intended but giving incorrect results?
  • Does the paper address possible harms following from potential misuse of the technology?
  • If the system learns from user input once deployed, does the paper describe checks and limitations to the learning?
  • Are any of the possible harms you’ve identified likely to fall disproportionately on populations that already experience marginalization or are otherwise vulnerable?

For papers using identity characteristics (e.g., gender, race, ethnicity) as variables:

  • Does the paper use self-identifications (rather than attributing identity characteristics to participants)?
  • Does the paper motivate the range of values used for identity characteristics in terms of how they relate to the research question?
  • Does the paper discuss the ethical implications of categorizing people, either in training datasets or in the deployment of the technology?

Most of the text above comes from the NAACL 2021 Ethics Review Questions; we thank the NAACL 2021 Ethics Chairs, Emily M. Bender and Karën Fort, for drafting the prior version.