May 17th, 2021

Instructions for Reviewers

Thank you for reviewing for EMNLP 2021! In order to ensure the quality of reviews, we would like to share with you the following instructions for reviewing EMNLP 2021 papers. Please read these instructions before you start reviewing papers.

Confidentiality

Please note that the content of any submission to EMNLP 2021, and the participants in and content of discussion on submissions, are confidential.

The Review Form

We adapted the review form from EMNLP 2020, NAACL 2021, and ACL-IJCNLP 2021. The review form consists of six main sections.

In-Depth Review: This section is for you to give your overall assessment of the paper and to provide evidence to support your opinions. There are 3 subsections:
- The core review: This is the most important part. It should include your view of the main contributions that the paper intended to make and how well it succeeds at making these contributions. From your point of view, what are the significant strong and weak parts of the paper and the work it describes? This could be a 2-paragraph (or longer) essay and/or bullet points. Remember to describe how the work advances the state of knowledge in NLP and/or highlight why it fails to make a sufficient contribution.
- Reasons to accept: please briefly summarize from your core review the main reasons why this paper should be accepted for the conference, and how the NLP community would benefit from it. You may refer back to your review to provide more context and details.
- Reasons to reject: please briefly summarize the main reasons that this paper cannot be published and presented in its current form. What are the parts that would need to be improved in order to advance the state of knowledge?
Questions and Additional Feedback for the Authors: Since we will have an author response process, for questions you would like the author(s) to respond to during the response period, please include them here. This is also the place for you to give suggestions to the authors to help them improve the paper for the final version (or a future submission).
Reproducibility, Ethics Review, Anonymity Requirement and Overall Recommendation: The questions in this section ask you to evaluate how reproducible the results in the paper are, whether there are any ethical concerns, and whether anonymity is preserved, and to give your overall recommendation. Answers to most questions in this section are only shared with the committee, except the overall recommendation.
- Reproducibility: We use a reproducibility checklist in an effort to increase reproducibility of the research work in NLP (see EMNLP 2021 call for papers). In the review form, please answer the following two questions.
  - "How do you rate the paper's reproducibility? Will members of the ACL community be able to reproduce or verify the results in this paper?" Scores of 1-5 are used to assess this aspect. The detailed explanation for each point level is provided in the review form. N/A can be used for papers that do not include empirical results.
  - "Are the authors' answers to the Reproducibility Checklist useful for evaluating the submission?". The checklist is given at the end of this document. Three choices are provided for this question (very useful, somewhat useful, not useful). Note that this question is for us to collect feedback regarding the usefulness of the reproducibility checklist, and is not about evaluating the paper itself.
- Ethical Concerns: Authors are required to honor the ethical code set out in the ACL Code of Ethics. This year, authors are allowed extra space after the 8th page (or the 4th page for short papers) for a broader impact statement or other discussion of ethics. Regardless of whether the paper includes an impact statement, you need to read the whole paper carefully and determine whether it raises any potential ethical issues. If that is the case, you should flag the paper by choosing "yes" to the "ethical concerns" question. The flagged papers will then be reviewed by the Ethics Committee.
  - Please review the relevant Ethics review questions and the Ethics FAQ for more guidance on the problems to look out for and key concerns to consider.
  - If you flag a paper, it is very important that you provide justification as detailed as possible to help the Ethics Committee to understand your concern.
  - Not every submission includes ethics/impact statements, and most papers are unlikely to have ethical issues. Therefore, please read the statements and use your best judgment to determine whether the paper should be flagged. Please don’t automatically flag a paper simply because it includes an ethics/impact statement.
  - Regardless of whether you flag a paper, you should review the paper carefully and answer the rest of the questions on the review form. The final decision on whether or not it has ethical issues will be determined by the Ethics Committee.
- Anonymity Requirement: Please inform PCs if you know or think you know the authors of the paper and what makes you think that.
- Overall recommendation: Here you are asked to synthesize your views and come up with your recommendation for the paper. Notice that the criteria for the long papers and short papers are different.
  - We use a 5-point scale with half-point increments. The detailed explanation for each point level is provided in the review form. These numbers are just a concise way of expressing your overall opinion and the relative importance of the factors mentioned above.
  - We allow a rating of 3 (ambivalent), but please try to take a stand on whether the paper is above or below the borderline, e.g., by selecting 2.5 or 3.5. However, if you think this is indeed a borderline paper or you are not able to decide, you can use 3.
  - Decisions will not be based on the scores alone, and certainly not on average scores; they will also take into account the whole review, the reviewer discussion, and the Area Chair meta-reviews and recommendations. However, it is important to align your recommendation score with the reasoning given above, so that authors will be able to understand the motivation for the recommendations and how decisions were arrived at.
- Reviewer confidence: This section should be used to inform the committee how confident you are about your recommendation, taking into account your own expertise and familiarity with this area and the paper's contents.
- Recommendations for Awards: If you believe the paper should be considered for a best paper award, indicate it here, along with the justification. Please be open-minded and feel free to nominate good-quality papers even if they are not of the typical kind. These can be a survey paper, an opinion paper, a paper about resources and datasets, a paper on a low-resource language, an analysis paper, etc. A committee will evaluate best paper candidates, and we would like to have a wide variety of paper types in the candidate pool, not just vanilla empirical research papers.
Changes after the Rebuttal Period:
- There will be an author response period. It is important for you to check whether author responses have cleared up your questions or misunderstandings. This may influence your overall recommendation and the core review. If that's the case, please update your recommendation and review accordingly (and state in your review any new decisions you made so that the Area Chairs are aware).
Suitability for Media Dissemination:
- We plan to invite some authors to write lay summaries of their work and share those summaries with journalists. If you believe that the paper might be of particular public interest, please indicate it here, along with the justification.
Confidential Information: Your answers to questions in this section will not be shared with the authors.
- Confidential Comments to the Area Chair and Peer Reviewers: If you want to share some information with the area chair (meta-reviewer) and other reviewers, please state it here.
- Confidential Comments to Senior Area Chairs and PC Chairs: Use this box for information that you only want to share with the senior area chairs and program chairs.

Approach

Please take a balanced approach when reading these papers. One objective is to have a solid technical program; hence, it is important to be thorough. On the other hand, we also want a broad and interesting program - so please do not be picky. You were selected to serve on the committee because you are a respected contributor to your field. You probably received papers that are not up to your personal standards. Some may still have technical merit, and could be interesting to others. Please try to keep an open mind. Don't get hung up about recommending too many papers. If you think a paper will be interesting to attendees, then recommend it.

Supplementary Materials

Supplementary materials are allowed as a stand-alone document uploaded as an additional file. Supplementary materials are, as the name suggests, supplementary, and you have no obligation to read them. You should treat them like other citations in submissions that may be helpful in understanding background or details beyond the scope of the paper itself.

However, as noted above, given the new requirement for reproducibility, authors may provide additional information about their datasets and experiments in an appendix, and attach a zip file with resources such as code and data. Please take some time to check those, if applicable. If the data/code is provided as a hyperlink, please make sure to flag cases where this breaches the anonymity rules.

Secondary Reviewers

As in most previous NLP conferences, you are allowed to solicit help from others. However, when it comes to writing the final review and giving the final scores, we expect you to take the secondary reviewer's review, rewrite it in your own words, and adjust the scores as you see fit. Essentially, the final review should reflect your own opinions about the paper, and you need to be able to justify the opinions you present in the final review.

Format of Submissions

The program chairs and area chairs have already identified submissions that violated the formatting guidelines and have desk-rejected those submissions. However, if you think the paper still violates the format guidelines, please contact your area chairs or PCs immediately.

Important Dates

Your initial reviews are due Monday, July 5, 2021 (11:59pm anywhere on Earth). Additionally, your input is needed during the reviewer discussion period from July 18 to 28, after the author response. Based on this discussion, you are expected to double check and consider updating your review and recommendation where relevant. Your duties are summarised below:

June 10 – July 5: Review Period
July 18 – 28: Reviewer discussion period (and please update reviews where applicable)

Review Advice

Please read the post on writing good reviews from EMNLP-2020. ACs will be instructed to flag poor reviews, ask reviewers to revise their reviews or provide objective reasons to justify their positions.

Findings of EMNLP

We will continue providing the acceptance option of Findings of EMNLP, introduced in EMNLP-2020, for papers that narrowly miss out on publication in EMNLP, but are judged to be worthy of publication. This will affect the acceptance decisions made by ACs, SACs and the PCs, but will not require any specific inputs from reviewers.

Reproducibility Checklist

For all reported experimental results:
(ModelDescription) A clear description of the mathematical setting, algorithm, and/or model
(LinkToCode) Submission of a zip file containing source code, with specification of all dependencies, including external libraries, or a link to such resources (while still anonymized)
(Infra) Description of computing infrastructure used
(Runtime) The average runtime for each model or algorithm (e.g., training, inference, etc.), or estimated energy cost
(Parameters) Number of parameters in each model
(ValidationPerf) Corresponding validation performance for each reported test result
(Metrics) Explanation of evaluation metrics used, with links to code

For all experiments with hyperparameter search:
(NoRuns) The exact number of training and evaluation runs
(HyperBound) Bounds for each hyperparameter
(HyperBestConfig) Hyperparameter configurations for best-performing models
(HyperSearch) Number of hyperparameter search trials
(HyperMethod) The method of choosing hyperparameter values (e.g., uniform sampling, manual tuning, etc.) and the criterion used to select among them (e.g., accuracy)
(ExpectedPerf) Summary statistics of the results (e.g., mean, variance, error bars, etc.)
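
As a rough illustration of what several of the hyperparameter-search items above might look like in practice, the sketch below samples configurations uniformly within stated bounds and then reports summary statistics over the trials. It is a minimal Python sketch and not part of the official checklist: the bounds, the `toy_validation_accuracy` function (a stand-in for a real training run), and all other names are hypothetical.

```python
import random
import statistics

# Hypothetical bounds for two hyperparameters (HyperBound).
BOUNDS = {"learning_rate": (1e-5, 1e-3), "dropout": (0.0, 0.5)}


def toy_validation_accuracy(config, rng):
    """Stand-in for a real training run; returns a made-up accuracy in [0, 1]."""
    penalty = abs(config["dropout"] - 0.1) + abs(config["learning_rate"] - 3e-4) * 100
    noise = rng.uniform(-0.01, 0.01)
    return max(0.0, min(1.0, 0.9 - penalty + noise))


def uniform_search(bounds, num_trials, seed=0):
    """Sample each hyperparameter uniformly within its bounds (HyperMethod)."""
    rng = random.Random(seed)
    trials = []
    for _ in range(num_trials):  # num_trials is the number to report (HyperSearch)
        config = {name: rng.uniform(low, high) for name, (low, high) in bounds.items()}
        trials.append((config, toy_validation_accuracy(config, rng)))
    return trials


def summarize(scores):
    """Mean, sample variance, and standard error of the mean (ExpectedPerf)."""
    mean = statistics.mean(scores)
    variance = statistics.variance(scores)  # requires at least two scores
    std_error = (variance / len(scores)) ** 0.5
    return {"n": len(scores), "mean": mean, "variance": variance, "std_error": std_error}


trials = uniform_search(BOUNDS, num_trials=20)
# Selection criterion: highest validation accuracy (HyperBestConfig).
best_config, best_score = max(trials, key=lambda t: t[1])
report = summarize([score for _, score in trials])
print("trials:", report["n"])
print("best config:", best_config, "accuracy:", round(best_score, 4))
print("mean:", round(report["mean"], 4), "std error:", round(report["std_error"], 4))
```

A paper that reports the bounds, the sampling method, the number of trials, the selected configuration, and the summary statistics in this spirit would address most of the items in this group.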

For all datasets used:
(DataStats) Relevant details such as languages, number of examples, and label distributions
(DataSplit) Details of train/validation/test splits
(DataProcessing) Explanation of any data that were excluded, and all pre-processing steps
(DataDownload) A zip file containing data or link to a downloadable version of the data
(Newdata) For new data collected, a complete description of the data collection process, such as instructions to annotators and methods for quality control