TFMA analyze_raw_data function support with MultiClassConfusionMatrixPlot #162

Open
tybrs opened this issue Oct 19, 2022 · 3 comments
tybrs commented Oct 19, 2022

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow Model Analysis): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 20.04.2 LTS Linux 5.4.0-65-generic
  • TensorFlow Model Analysis installed from (source or binary): pip install tensorflow-model-analysis
  • TensorFlow Model Analysis version (use command below): 0.41.1
  • Python version: Python 3.8.5
  • Jupyter Notebook version: NA

Describe the problem

I am currently trying to get tfma.analyze_raw_data to work with MultiClassConfusionMatrixPlot which has multiple prediction values per record. Is this not supported? I will be happy to provide any further details or run any further tests.

Details

Currently tfma.analyze_raw_data does not seem to work with metrics for multi-class classification tasks (e.g., tfma.metrics.MultiClassConfusionMatrixPlot). However, I do not see this limitation documented anywhere.

The prediction column for a multi-class classification task will be a series whose values are lists or arrays (e.g., pd.DataFrame({'predictions': [[0.2, .3, .5]], 'label': [1]})).

The tfma.analyze_raw_data function uses tfx_bsl.arrow.table_util.DataFrameToRecordBatch to convert a Pandas DataFrame to an Arrow RecordBatch. The problem, however, is that it encodes columns with dtype object as pyarrow.Binary. Since a column whose values are lists or arrays has dtype object, these columns are encoded as pyarrow.Binary instead of the relevant pyarrow list-like type.

Source code / logs

import tensorflow_model_analysis as tfma
from google.protobuf import text_format
import pandas as pd

eval_config = text_format.Parse("""
  ## Model information
  model_specs {
    label_key: "label",
    prediction_key: "predictions"
  }

  ## Post training metric information. These will be merged with any built-in
  ## metrics from training.
  metrics_specs {
    metrics { class_name: "MultiClassConfusionMatrixPlot" }
  }
  
  ## Slicing information
  slicing_specs {}  # overall slice
""", tfma.EvalConfig())

df = pd.DataFrame({'predictions': [[0.2, .3, .5]], 'label': [1]})
tfma.analyze_raw_data(df, eval_config)

Error

---------------------------------------------------------------------------
ArrowTypeError                            Traceback (most recent call last)
/tmp/ipykernel_206830/3947320198.py in <cell line: 23>()
     21 
     22 df = pd.DataFrame({'predictions': [[0.2, .3, .5]], 'label': [1]})
---> 23 tfma.analyze_raw_data(df, eval_config)

/localdisk/twilbers/src/repos/xai-tools/model_card_gen/.venv/lib/python3.8/site-packages/tensorflow_model_analysis/api/model_eval_lib.py in analyze_raw_data(data, eval_config, output_path, add_metric_callbacks)
   1511 
   1512   arrow_data = table_util.CanonicalizeRecordBatch(
-> 1513       table_util.DataFrameToRecordBatch(data))
   1514   beam_data = beam.Create([arrow_data])
   1515 

/localdisk/twilbers/src/repos/xai-tools/model_card_gen/.venv/lib/python3.8/site-packages/tfx_bsl/arrow/table_util.py in DataFrameToRecordBatch(dataframe)
    122       continue
    123     arrow_fields.append(pa.field(col_name, arrow_type))
--> 124   return pa.RecordBatch.from_pandas(dataframe, schema=pa.schema(arrow_fields))
    125 
    126 

/localdisk/twilbers/src/repos/xai-tools/model_card_gen/.venv/lib/python3.8/site-packages/pyarrow/table.pxi in pyarrow.lib.RecordBatch.from_pandas()

/localdisk/twilbers/src/repos/xai-tools/model_card_gen/.venv/lib/python3.8/site-packages/pyarrow/pandas_compat.py in dataframe_to_arrays(df, schema, preserve_index, nthreads, columns, safe)
    592 
    593     if nthreads == 1:
--> 594         arrays = [convert_column(c, f)
    595                   for c, f in zip(columns_to_convert, convert_fields)]
    596     else:

/localdisk/twilbers/src/repos/xai-tools/model_card_gen/.venv/lib/python3.8/site-packages/pyarrow/pandas_compat.py in <listcomp>(.0)
    592 
    593     if nthreads == 1:
--> 594         arrays = [convert_column(c, f)
    595                   for c, f in zip(columns_to_convert, convert_fields)]
    596     else:

/localdisk/twilbers/src/repos/xai-tools/model_card_gen/.venv/lib/python3.8/site-packages/pyarrow/pandas_compat.py in convert_column(col, field)
    579             e.args += ("Conversion failed for column {!s} with type {!s}"
    580                        .format(col.name, col.dtype),)
--> 581             raise e
    582         if not field_nullable and result.null_count > 0:
    583             raise ValueError("Field {} was non-nullable but pandas column "

/localdisk/twilbers/src/repos/xai-tools/model_card_gen/.venv/lib/python3.8/site-packages/pyarrow/pandas_compat.py in convert_column(col, field)
    573 
    574         try:
--> 575             result = pa.array(col, type=type_, from_pandas=True, safe=safe)
    576         except (pa.ArrowInvalid,
    577                 pa.ArrowNotImplementedError,

/localdisk/twilbers/src/repos/xai-tools/model_card_gen/.venv/lib/python3.8/site-packages/pyarrow/array.pxi in pyarrow.lib.array()

/localdisk/twilbers/src/repos/xai-tools/model_card_gen/.venv/lib/python3.8/site-packages/pyarrow/array.pxi in pyarrow.lib._ndarray_to_array()

/localdisk/twilbers/src/repos/xai-tools/model_card_gen/.venv/lib/python3.8/site-packages/pyarrow/error.pxi in pyarrow.lib.check_status()

ArrowTypeError: ("Expected bytes, got a 'list' object", 'Conversion failed for column predictions with type object')

Temporary fix

If I change/patch tfx_bsl.arrow.table_util.DataFrameToRecordBatch as follows, it seems to work, but I doubt this is the right fix.

def DataFrameToRecordBatch(dataframe):
    arrays = []
    for col_name, col_type in zip(dataframe.columns, dataframe.dtypes):
        arrow_type = None
        if col_type.kind != 'O':
            arrow_type = NumpyKindToArrowType(col_type.kind)
        # For object-dtype columns, leave arrow_type as None so pyarrow
        # infers the type (e.g. a list type) instead of forcing binary.
        arrays.append(pa.array(dataframe[col_name].values.tolist(), type=arrow_type))
    return pa.RecordBatch.from_arrays(arrays, names=list(dataframe.columns))
@singhniraj08

@tybrs,

The official documentation shows that multi-class classification metrics are supported by TFMA. MultiClassConfusionMatrixPlot is also supported by TFMA; please refer here.

Thank you!

@tybrs
Author

tybrs commented Oct 27, 2022

@singhniraj08 thanks for getting back!

I do see that MultiClassConfusionMatrixPlot is supported, which we use regularly on TF models. The question is whether MultiClassConfusionMatrixPlot is supported with DataFrames using analyze_raw_data.

If it is supported:
(1) how should the DataFrame be structured to support multiple prediction classes?
(2) how should the EvalConfig look?

@embr
Collaborator

embr commented Oct 31, 2022

Thanks for bringing this to our attention. This looks like a bug in code that attempts to construct pyarrow.RecordBatches in a special way. I suspect this was intended to limit the supported types to avoid problems downstream, but it seems to be too strict right now. It is possible that this may lead to more cryptic errors downstream, but at least for the toy DataFrame you've provided, I was able to get things working by directly calling pyarrow.RecordBatch.from_pandas, which correctly handles the nested array type:

from google.protobuf import text_format
import apache_beam as beam
import pandas as pd
import pyarrow as pa
import tensorflow_model_analysis as tfma
from tfx_bsl.arrow import table_util

df = pd.DataFrame({
    'predictions': [[0.6, .7, .8], [0.4, 0.5, 0.6]],
    'label': [1, 0]})

eval_config = text_format.Parse(
    """
    model_specs {
      label_key: 'label'
      prediction_key: 'predictions'
    }
    metrics_specs {
      metrics { class_name: "AUC" }
      binarize { class_ids { values: [1,2] } }
    }
    """, tfma.EvalConfig())
output_path = 'raw-analysis-output'

with beam.Pipeline() as p:
  _ = (
      p
      | 'CreateRecordBatches' >> beam.Create([
          table_util.CanonicalizeRecordBatch(
              pa.RecordBatch.from_pandas(df))])
      | 'ExtractEvaluateAndWriteResults' >> tfma.ExtractEvaluateAndWriteResults(
          eval_config=eval_config,
          output_path=output_path))

for metrics_per_slice in tfma.load_metrics(output_path):
  print(metrics_per_slice)

I'll look into supporting this case more generically, but I figured I'd share this workaround now to unblock you.
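For the multi-class plot from the original report, the metrics_specs in the workaround above could presumably be swapped back to the one from the issue (an untested sketch; the rest of the pipeline stays the same):

```protobuf
metrics_specs {
  metrics { class_name: "MultiClassConfusionMatrixPlot" }
}
```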
