-
Notifications
You must be signed in to change notification settings - Fork 2
Simple question parsing. #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
stopping
wants to merge
1
commit into
cleichner:master
Choose a base branch
from
stopping:question-parsing
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,6 @@ | ||
package com.whichclasses.scraper.page; | ||
|
||
import java.text.ParseException; | ||
import java.util.Collection; | ||
import java.util.List; | ||
import java.util.regex.Matcher; | ||
|
@@ -26,7 +27,7 @@ private static class ParserException extends RuntimeException { | |
public ParserException(String message, Throwable e) { super(message, e); } | ||
} | ||
|
||
public static TceClassProto parseTceClassPage(Document page) { | ||
public static TceClassProto parseTceClassPage(Document page) throws ParseException { | ||
return new TceClassPageParser(page).buildModel(); | ||
} | ||
|
||
|
@@ -63,7 +64,7 @@ private enum QuestionParsingState { | |
STATE_READING_RESULTS, | ||
} | ||
|
||
private TceClassProto buildModel() { | ||
private TceClassProto buildModel() throws ParseException { | ||
// The entire course row is a jumble. | ||
String courseRowText = getElementTagNameContainingText(metadataTable, "tr", "Course:").text(); | ||
Matcher courseRowMatcher = COURSE_ROW_PATTERN.matcher(courseRowText); | ||
|
@@ -132,11 +133,10 @@ private TceClassProto buildModel() { | |
} | ||
|
||
String questionText = questionRow.text(); | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: remove blank line |
||
Question question = parseQuestion(questionText); | ||
builder = TceRating.newBuilder() | ||
// TODO: which question is this? Different wordings map to | ||
// the same concept here. Translate from questionText, throw | ||
// if unrecognized. | ||
.setQuestion(Question.OVERALL_COURSE_RATING); | ||
.setQuestion(question); | ||
state = QuestionParsingState.STATE_READING_RESULTS; | ||
valueIndex = 0; | ||
|
||
|
@@ -163,6 +163,41 @@ private TceClassProto buildModel() { | |
.build(); | ||
} | ||
|
||
private Question parseQuestion(String questionText) throws ParseException { | ||
switch(questionText) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. style nit: space before "(" https://google-styleguide.googlecode.com/svn/trunk/javaguide.html#s4.6.2-horizontal-whitespace |
||
case "What is your overall rating of the instructor(s) teaching effectiveness in the online environment?": | ||
case "What is your overall rating of this instructor's teaching effectiveness?": | ||
return Question.OVERALL_INSTRUCTOR_RATING; | ||
case "What is your rating of this instructor compared with other instructors you have had?": | ||
return Question.COMPARATIVE_INSTRUCTOR_RATING; | ||
case "What is your overall rating of this course": | ||
case "What is your overall rating of this course?": | ||
return Question.OVERALL_COURSE_RATING; | ||
case "How much do you feel you have learned in this course?": | ||
return Question.COURSE_LEARNING; | ||
case "The difficulty level of the course is": | ||
case "How difficult was this course for you?": | ||
case "The difficulty level of the course is.": | ||
return Question.COURSE_DIFFICULTY; | ||
case "Rate the usefulness of the outside assignments (homework, papers, reports, and special projects, etc.) in helping you learn.": | ||
case "Rate the overall usefulness of outside (not in class) assignments (homework, papers, reports, special projects, online work, etc.) in helping you achieve important course goals and objectives": | ||
return Question.EXTERNAL_WORK_RATING; | ||
case "Rate the overall usefulness of in-class meeting time activities (e.g. lectures, discussions, teamwork, labs, etc.) in helping you achieve important course goals and objectives": | ||
return Question.INTERNAL_WORK_RATING; | ||
case "The materials used in this course (text, readings, websites, etc.) are.": | ||
case "Rate the overall usefulness of assigned texts and readings (print or online) in helping you achieve important course goals and objectives": | ||
return Question.COURSE_MATERIALS_RATING; | ||
case "I was treated with respect in this class.": | ||
return Question.INSTRUCTOR_RESPECT; | ||
case "Outside of class time, about how many hours per week have you spent on class-related work (reading, reviewing notes, writing papers, team meetings, etc.)?": | ||
case "On average, how many hours per week do you spend on this class, including attending class, doing readings, reviewing notes, and any other course-related work?": | ||
return Question.TIME_SPENT_EXTERNALLY; | ||
case "Of the total hours you spent on this class, how many were valuable in advancing your education?.": | ||
return Question.TIME_VALUE_RATING; | ||
} | ||
throw new ParseException("Could not parse question: \"" + questionText + "\"", 0); | ||
} | ||
|
||
/** | ||
* "Spring-2009" --> 091 | ||
* @param termString | ||
|
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a fan of needing "throws" declarations---they end up being awkward to use due to language requirements. I think your implementation re-throwing it a couple levels and then ending up with a printStackTrace and a TODO is evidence of it being an "anti-pattern". Unfortunately, ParseException is an Exception and not a RuntimeException, so the language leaves you no choice (again, "anti-pattern").
How do you feel about just throwing it as a RuntimeException (to avoid the language shenanigans)? Then later, for proper handling, we can install a few high-level exception handlers: e.g. an Exception caught while doing downloading/parsing of all the pages could gracefully retry (for something like a network failure) or forcefully exit (for a deterministic parsing failure), while a frontend failure while serving the site should be logged but not bring down the server.