The Categorizer module orchestrates text categorization. It contains the categorization logic shared by the zero-shot and few-shot cases.
BaseCategorizer
Bases: WorkflowHandler, ABC
The BaseCategorizer class is a subclass of WorkflowHandler and an abstract base class (ABC) with an empty constructor.
Source code in LabeLMaker/Categorize/categorizer.py
LabeLMaker
Bases: BaseCategorizer
The constructor initializes the instance attributes used to generate prompts from the specified parameters.
Parameters:
the path to a file or directory related to the prompt.
categorzation_request (CategorizationRequest): the categorization request. The parameter name is misspelled in the source; the intended spelling is categorization_request.
llm_interface: the LLM interface to use. If no value is provided when creating an instance of the class, it defaults to Config.LLM_INTERFACE.
Source code in LabeLMaker/Categorize/categorizer.py
categorize_item(item_args)
Categorizes an item based on the input arguments and handles exceptions.
:param item_args: A dictionary passed to the chain, containing the data for the item to be categorized.
:return: The result of invoking the chain with the item_args provided. If the result is None, a ValueError is raised indicating that the chain returned None for the input. If an exception occurs while processing the item, the method prints an error message with the item details and the exception message, then prints the traceback.
Source code in LabeLMaker/Categorize/categorizer.py
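The error handling described for categorize_item can be sketched roughly as follows. This is a minimal illustration, not the code in categorizer.py: the EchoChain class and its invoke method are hypothetical stand-ins for the real chain, the function takes the chain as an argument instead of reading it from self, and returning None after a failure is an assumption, since the description does not say what happens after the traceback is printed.

```python
import traceback


class EchoChain:
    """Hypothetical stand-in for the real chain; yields None for empty text."""

    def invoke(self, item_args):
        text = item_args.get("text", "")
        return f"Category: demo ({text})" if text else None


def categorize_item(chain, item_args):
    """Invoke the chain and treat a None result as an error."""
    try:
        result = chain.invoke(item_args)
        if result is None:
            raise ValueError(f"Chain returned None for input: {item_args}")
        return result
    except Exception as exc:
        # Report the failing item, the exception message, and the traceback.
        print(f"Error processing item {item_args}: {exc}")
        traceback.print_exc()
        return None  # assumption: the behaviour after the traceback is not documented
```

Raising the ValueError inside the try block means a None result goes through the same reporting path as any other failure.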
categorize_text_with_retries(text_to_categorize, max_retries=Config.MAX_RETRIES)
Categorizes text with retry logic to ensure that a valid category is obtained.
:param text_to_categorize: The input text to be categorized.
:type text_to_categorize: str
:param max_retries: The maximum number of retries allowed when attempting to categorize the text. If no valid category is produced within this number of retries, the category is set to "Uncategorized".
:type max_retries: int
:return: A tuple containing the rationale and category of the text. If a valid category is not obtained after the maximum number of retries, the category is "Uncategorized".
Source code in LabeLMaker/Categorize/categorizer.py
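A retry loop of the kind described above might look like the sketch below. Several names are assumptions made for illustration: MAX_RETRIES stands in for Config.MAX_RETRIES, categorize_once is a hypothetical callable returning a (rationale, category) pair, and validity is assumed to mean membership in a known set of labels, which the description does not specify.

```python
MAX_RETRIES = 3  # hypothetical stand-in for Config.MAX_RETRIES


def make_scripted_model(responses):
    """Hypothetical helper: returns each scripted (rationale, category) in turn."""
    scripted = iter(responses)
    return lambda text: next(scripted)


def categorize_text_with_retries(text_to_categorize, categorize_once,
                                 valid_categories, max_retries=MAX_RETRIES):
    """Call categorize_once until it yields a valid category or retries run out."""
    rationale = None
    for _ in range(max_retries):
        rationale, category = categorize_once(text_to_categorize)
        if category in valid_categories:  # assumed validity check
            return rationale, category
    # No valid category within max_retries attempts: fall back to the default label.
    return rationale, "Uncategorized"
```

In this sketch the rationale from the last attempt is kept even when the category falls back to "Uncategorized"; whether the real method does the same is not stated.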
create_chain()
Builds the chain by combining the prompt_template and llm attributes.
:return: The chain formed by combining self.prompt_template and self.llm.
Source code in LabeLMaker/Categorize/categorizer.py
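The description does not say how the two attributes are combined. One common pattern is to pipe the output of the prompt template into the model, and the sketch below illustrates that idea in plain Python. The Step class, the pipe operator, and the invoke method are all hypothetical and are not taken from categorizer.py.

```python
class Step:
    """Hypothetical pipeline step that supports `a | b` composition."""

    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # Composing two steps feeds the output of the first into the second.
        return Step(lambda value: other.fn(self.fn(value)))

    def invoke(self, value):
        return self.fn(value)


# Stand-ins for self.prompt_template and self.llm.
prompt_template = Step(lambda args: f"Categorize this text: {args['text']}")
llm = Step(lambda prompt: f"echo: {prompt}")


def create_chain(prompt_template, llm):
    """Combine the prompt template and the model into one callable chain."""
    return prompt_template | llm
```

With these stand-ins, invoking the chain first formats the prompt from the item arguments and then passes the formatted prompt to the model.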
extract_category(content)
staticmethod
Extracts the category that follows the "Category:" keyword in a string, using a regular expression pattern.
:param content: The content string to search.
:type content: str
:return: The category extracted from the input content string. If the string contains the pattern "Category: " followed by any characters, those characters are returned as the category. If the pattern is not found, the function returns None.
Source code in LabeLMaker/Categorize/categorizer.py
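As a rough illustration of the behaviour described, the extraction can be written with Python's re module. The exact pattern used in categorizer.py is not shown on this page, so the pattern below is an assumption. It captures the rest of the line after "Category:" and strips surrounding whitespace.

```python
import re


def extract_category(content):
    """Return the text after 'Category:' on the same line, or None if absent."""
    # Assumed pattern: '.' does not match newlines, so capture stops at end of line.
    match = re.search(r"Category:\s*(.*)", content)
    return match.group(1).strip() if match else None
```

For example, a response whose last line is "Category: Sports" yields "Sports", while a response without the keyword yields None.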
extract_rationale(content)
staticmethod
Extracts the rationale from a content string using a regular expression pattern.
:param content: The content string to search. The rationale is expected after the text "Rationale:" and before the text "Category:" or the end of the string.
:type content: str
:return: The rationale extracted from the input content string, with leading and trailing whitespace stripped. If no match is found, the function returns None.
Source code in LabeLMaker/Categorize/categorizer.py
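The rule "after Rationale: and before Category: or the end of the string" can be expressed with a lazy capture and a lookahead. The pattern below is an assumed reconstruction of that rule, not necessarily the one used in categorizer.py.

```python
import re


def extract_rationale(content):
    """Return the text between 'Rationale:' and 'Category:' (or end of string)."""
    # re.DOTALL lets the lazy capture span multiple lines; the lookahead stops
    # it at the next 'Category:' marker or at the end of the string.
    match = re.search(r"Rationale:\s*(.*?)(?=Category:|$)", content, re.DOTALL)
    return match.group(1).strip() if match else None
```

The lazy quantifier matters here: a greedy capture would run to the end of the string and include the category line in the rationale.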
process()
Categorizes the text data with retries and returns the results.
:return: A list of tuples, where each tuple contains the original text, the category assigned to that text, and the rationale for the categorization.
Source code in LabeLMaker/Categorize/categorizer.py
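The shape of the return value can be illustrated with a short sketch. Here the texts and the categorization callable are passed in as arguments, which is an assumption made to keep the example self-contained; the real method presumably reads them from the instance. The stub_categorizer function is hypothetical.

```python
def stub_categorizer(text):
    """Hypothetical stand-in that returns a (rationale, category) pair."""
    category = "Question" if text.endswith("?") else "Statement"
    return f"Chosen from punctuation of {text!r}", category


def process(texts, categorize_with_retries):
    """Categorize each text and collect (text, category, rationale) tuples."""
    results = []
    for text in texts:
        rationale, category = categorize_with_retries(text)
        # Note the order: the result tuple puts the category before the rationale.
        results.append((text, category, rationale))
    return results
```

Each result keeps the original text alongside its label, so the output can be joined back to the input data without relying on ordering alone.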