K Nearest Neighbors
Example 1
Summary: Performs a K Nearest Neighbors analysis on the 'data_table.jmp' table, collapses the Training outlines, restyles the mosaic plot text, and adds a column switcher for the response.
Code:
dt = Open("data_table.jmp");
knn = dt << K Nearest Neighbors(
Y( :country ),
X( :sex, :marital status, :age, :type ),
K( 10 ),
Category Bias( 0.2 ),
SendToReport(
Dispatch( {"country"}, "Training", OutlineBox, {Close( 1 )} ),
Dispatch( {"country", "Confusion Matrix for Best K=1"}, "Training", OutlineBox, {Close( 1 )} ),
Dispatch( {"country", "Mosaic Plot"}, "Mosaic Plot for K=1", TextBox, {Set Font Scale( 2 )} ),
Dispatch( {"country", "Mosaic Plot", "Training"}, "country Predicted", TextEditBox, {Text Color( "Red" ), Font Color( 3 )} ),
Dispatch( {"country", "Mosaic Plot", "Training"}, "country", TextEditBox, {Text Color( "Red" ), Font Color( 3 )} )
)
);
cs = knn << Column Switcher( :country, {:country, :size} );
Code Explanation:
- Open table.
- Perform KNN analysis.
- Set response variable.
- Set predictor variables.
- Specify number of neighbors.
- Set category bias.
- Close training report.
- Close confusion matrix report.
- Increase font size in mosaic plot.
- Change label text color to red in mosaic plot.
- Add column switcher for country and size.
Example 2
Summary: Performs a K Nearest Neighbors analysis to predict country based on sex, marital status, age, and type, with category bias applied and interactive column switching for size.
Code:
dt = Open("data_table.jmp");
knn = dt << K Nearest Neighbors(
Y( :country ),
X( :sex, :marital status, :age, :type ),
K( 10 ),
Category Bias( 0.2 ),
SendToReport(
Dispatch( {"country"}, "Training", OutlineBox, {Close( 1 )} ),
Dispatch( {"country", "Mosaic Plot", "Training"}, "country Predicted", TextEditBox, {Text Color( "Red" ), Font Color( 3 )} ),
Dispatch( {"country", "Mosaic Plot", "Training"}, "country", TextEditBox, {Text Color( "Red" ), Font Color( 3 )} )
)
);
cs = knn << Column Switcher( :country, {:country, :size} );
cs << Set Current( "size" );
Code Explanation:
- Open data table.
- Perform K Nearest Neighbors analysis.
- Set Y variable to country.
- Set X variables: sex, marital status, age, type.
- Use 10 nearest neighbors.
- Apply category bias of 0.2.
- Close training outline box.
- Change predicted country text color to red.
- Change actual country text color to red.
- Switch column to size.
Example 3
Summary: Performs a K Nearest Neighbors analysis to predict country from sex, marital status, and age, with the mosaic plot turned off and a column switcher offering country, size, and type as the response.
Code:
dt = Open("data_table.jmp");
knn = dt << K Nearest Neighbors(
Y( :country ),
X( :sex, :marital status, :age ),
K( 10 ),
Category Bias( 0.2 ),
Response( "country", Mosaic Plot( 0 ) )
);
cs = knn << Column Switcher( :country, {:country, :size, :type} );
Code Explanation:
- Open data table.
- Run K Nearest Neighbors analysis.
- Set response variable as country.
- Use sex, marital status, age as predictors.
- Specify K value as 10.
- Apply category bias of 0.2.
- Turn off mosaic plot for country response.
- Add column switcher to analysis.
- Offer country, size, and type as switchable columns.
Example 4
Summary: Performs a K Nearest Neighbors analysis to predict country from sex, marital status, and age, using 10 nearest neighbors and a category bias of 0.2, turns off the mosaic plot, and switches the response to size through a column switcher.
Code:
dt = Open("data_table.jmp");
knn = dt << K Nearest Neighbors(
Y( :country ),
X( :sex, :marital status, :age ),
K( 10 ),
Category Bias( 0.2 ),
Response( "country", Mosaic Plot( 0 ) )
);
cs = knn << Column Switcher( :country, {:country, :size, :type} );
cs << Set Current( "size" );
Code Explanation:
- Open data table.
- Run K Nearest Neighbors analysis.
- Set response variable to "country".
- Include "sex", "marital status", "age" as predictors.
- Use 10 nearest neighbors.
- Apply category bias of 0.2.
- Turn off mosaic plot for "country".
- Create column switcher for "country".
- Include "country", "size", "type" in switcher.
- Set current column to "size".
Example 5
Summary: Fits a K Nearest Neighbors model with a fixed random seed, draws 10 random numbers, closes the window, then repeats the same fit, draws, and close 10 times.
Code:
dt = Open("data_table.jmp");
obj = dt << K Nearest Neighbors(
Validation( :Validation ),
Y( :Y Binary ),
X( :Age, :Gender, :BMI, :BP, :Total Cholesterol, :LDL, :HDL, :TCH, :LTG, :Glucose ),
K( 9 ),
Set Random Seed( 132752 )
);
rn1 = [];
For( i = 1, i <= 10, i++,
rn1 |/= Random Uniform()
);
obj << close window( 1 );
For( i = 1, i <= 10, i++,
rn2 = [];
obj2 = dt << K Nearest Neighbors(
Validation( :Validation ),
Y( :Y Binary ),
X( :Age, :Gender, :BMI, :BP, :Total Cholesterol, :LDL, :HDL, :TCH, :LTG, :Glucose ),
K( 9 ),
Set Random Seed( 132752 )
);
For( j = 1, j <= 10, j++,
rn2 |/= Random Uniform()
);
obj2 << close window( 1 );
);
Code Explanation:
- Open data table.
- Create K Nearest Neighbors model.
- Set validation column.
- Define response variable.
- Specify predictor variables.
- Set number of neighbors (K).
- Initialize random seed.
- Generate 10 random numbers.
- Close first model window.
- Repeat the model fit, the 10 random draws, and the window close 10 times with the same seed.
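Note: the script collects rn1 and rn2 but never compares them. The following is a minimal sketch of such a comparison, assuming the intent is to confirm that reseeding reproduces the same random stream. Random Reset stands in here for the platform's Set Random Seed option, and the names draw, a, and b are hypothetical, not part of the original script.

```jsl
// Hypothetical helper: collect n uniform draws into a column vector,
// using the same |/= pattern as the example above.
draw = Function( {n}, {v, i},
	v = [];
	For( i = 1, i <= n, i++,
		v |/= Random Uniform()
	);
	v
);

Random Reset( 132752 );   // seed the global random stream
a = draw( 10 );
Random Reset( 132752 );   // reseed with the same value
b = draw( 10 );

// All() returns 1 only when every elementwise comparison is true.
Show( All( a == b ) );
```

If the two streams match, the comparison is expected to show 1. Whether the platform option reseeds the same global stream is not established by the example itself.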
Example 6
Summary: Executes K Nearest Neighbors models with random seed initialization and window closing, iterating 10 times to generate multiple model runs.
Code:
dt = Open("data_table.jmp");
obj = dt << K Nearest Neighbors(
Validation( :Validation ),
Y( :Y Binary ),
X( :Age, :Gender, :BMI, :BP, :Total Cholesterol, :LDL, :HDL, :TCH, :LTG, :Glucose ),
K( 9 ),
Set Random Seed( 132752 )
);
rn1 = [];
For( i = 1, i <= 100, i++,
rn1 |/= Random Uniform()
);
obj << close window( 1 );
For( i = 1, i <= 10, i++,
rn2 = [];
obj2 = dt << K Nearest Neighbors(
Validation( :Validation ),
Y( :Y Binary ),
X( :Age, :Gender, :BMI, :BP, :Total Cholesterol, :LDL, :HDL, :TCH, :LTG, :Glucose ),
K( 9 ),
Set Random Seed( 132752 )
);
obj2 << close window( 1 );
);
Code Explanation:
- Open data table.
- Run K Nearest Neighbors model.
- Initialize random number list.
- Generate 100 random numbers.
- Close first KNN window.
- Loop 10 times.
- Initialize second random number list.
- Run K Nearest Neighbors model again.
- Close second KNN window.
K Nearest Neighbors using If
Example 1
Summary: Performs a K Nearest Neighbors analysis on a data table, utilizing local data filtering and report dispatching for training, validation, and test sets.
Code:
If( Contains( JMP Product Name(), "Pro" ),
Open("data_table.jmp") << K Nearest Neighbors(
Validation( :Validation ),
Y( :BAD ),
X( :LOAN, :MORTDUE, :VALUE, :REASON, :JOB, :YOJ, :DEROG, :DELINQ, :CLAGE, :NINQ, :CLNO ),
K( 10 ),
Local Data Filter(
Inverse( 1 ),
Add Filter(
columns( :DEBTINC, :JOB ),
Where( :DEBTINC >= 0.524499215429881 & :DEBTINC <= 76.56 ),
Display( :JOB, Size( 149, 119 ) )
)
),
SendToReport(
Dispatch( {"BAD"}, "Training", OutlineBox, {Close( 1 )} ),
Dispatch( {"BAD"}, "Validation", OutlineBox, {Close( 1 )} ),
Dispatch( {"BAD"}, "Test", OutlineBox, {Close( 1 )} )
)
)
);
Code Explanation:
- Check for JMP Pro version.
- Open data table.
- Apply K Nearest Neighbors method.
- Use Validation column for validation.
- Set BAD as response variable.
- Include specified predictors.
- Set K value to 10.
- Apply local data filter.
- Invert filter condition.
- Close Training, Validation, Test reports.
Example 2
Summary: Performs a K Nearest Neighbors analysis in JMP Pro, specifying the response variable and predictor variables, and retrieving the training misclassification rates and the best K.
Code:
If( Contains( JMP Product Name(), "Pro" ),
dt = Open("data_table.jmp");
obj = dt << K Nearest Neighbors(
Y( :Species ),
X( :Sepal length, :Sepal width, :Petal length, :Petal Width ),
K( 149 ),
Set Random Seed( 5384 )
);
rpt = obj << report;
mr = rpt[Outline Box( "Training" )][Number Col Box( "Misclassification Rate" )] << get as matrix;
minloc = Min( Loc( mr, Min( mr ) ) );
getbest = obj << (Response[1] << Get Best K);
Close( dt, no save );
);
Code Explanation:
- Check for JMP Pro.
- Open data table.
- Run K Nearest Neighbors.
- Specify response variable.
- Define predictor variables.
- Set K value to 149.
- Set random seed to 5384.
- Retrieve report object.
- Extract misclassification rate.
- Find first location of the minimum misclassification rate.
- Get best K for the response.
- Close data table without saving.
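The location lookup in this example can be tried on a literal vector. This is a sketch with made-up rates, not output from the analysis. It uses the one-argument form of Loc on an elementwise comparison, which is a substitution for the two-argument call in the script.

```jsl
// Placeholder misclassification rates (illustrative values only).
mr = [0.10, 0.04, 0.04, 0.07];

// mr == Min( mr ) marks every position that attains the minimum;
// Loc() returns those positions, and Min() keeps the smallest one.
minloc = Min( Loc( mr == Min( mr ) ) );
Show( minloc );   // position 2: the first of the two tied minima
```

Taking the smallest position resolves ties in favor of the earliest row of the report table.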
Example 3
Summary: Fits a K Nearest Neighbors model with a validation column in JMP Pro, sets K to 6 for the response, redoes the analysis, and reads back the script saved to the report.
Code:
If( Contains( JMP Product Name(), "Pro" ),
dt = Open("data_table.jmp");
obj1 = dt << K Nearest Neighbors(
Validation( :Validation ),
Y( :Y Binary ),
X( :Age, :Gender, :BMI, :BP, :Total Cholesterol, :LDL, :HDL, :TCH, :LTG, :Glucose ),
Set Random Seed( 12354 ),
K( 10 )
);
obj1 << (Response[1] << Set K( 6 ));
rpt1 = obj1 << report;
obj2 = obj1 << Redo Analysis( 1 );
rpt2 = obj2 << report;
obj1 << Save Script to Report( 1 );
saved1 = rpt1[Text Box( 1 )] << get text;
Close( dt, no save );
);
Code Explanation:
- Check if JMP Pro is installed.
- Open data table.
- Run K Nearest Neighbors analysis.
- Set validation column.
- Specify response variable.
- Define predictor variables.
- Set random seed for reproducibility.
- Set K value to 10.
- Adjust K value to 6 for response.
- Generate first report.
- Redo analysis with new settings.
- Generate second report.
- Save script to report.
- Extract saved script text.
- Close dataset without saving.
Example 4
Summary: Runs a K Nearest Neighbors model in JMP Pro, collects the training and validation RSquare values into one matrix, saves the predicted values, and reads RSquare from a Model Comparison report grouped by the validation column.
Code:
If( Contains( JMP Product Name(), "Pro" ),
dt = Open("data_table.jmp");
obj = dt << K Nearest Neighbors(
Validation( :Validation ),
Y( :Y ),
X( :Age, :Gender, :BMI, :BP, :Total Cholesterol, :LDL, :HDL, :TCH ),
K( 5 ),
Set Random Seed( 123 )
);
rpt = obj << report;
rsquare = (rpt["Y"]["Training"][Number Col Box( "RSquare" )] << get as matrix) |/ (rpt["Y"]["Validation"][Number Col Box( "RSquare" )] << get as matrix);
obj << (Response[1] << Save Predicteds);
obj2 = dt << Model Comparison( Group( :Validation ) );
rpt2 = obj2 << report;
b rsquare = rpt2["Measures of Fit for Y"][Number Col Box( "RSquare" )] << get as matrix;
Close( dt, no save );
);
Code Explanation:
- Check if JMP is Pro.
- Open data table.
- Run K Nearest Neighbors model.
- Extract training RSquare.
- Extract validation RSquare.
- Stack the two RSquare values into one matrix.
- Save predictions.
- Run Model Comparison.
- Extract RSquare from the Model Comparison report.
- Close dataset without saving.
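Because |/ is vertical concatenation, the script builds a two-row matrix rather than a quotient. A short sketch of the difference follows, using placeholder numbers (not results from any analysis) and the hypothetical names train and valid.

```jsl
// Placeholder 1 x 1 matrices standing in for the RSquare values read from the report.
train = [0.60];
valid = [0.30];

stacked = train |/ valid;   // vertical concatenation: 2 x 1 matrix
ratio = train :/ valid;     // elementwise division: 1 x 1 matrix

Show( N Rows( stacked ), ratio );
```

If a ratio is what is wanted, the elementwise division is the form to use; the concatenated form is convenient for comparing both values side by side.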
Example 5
Summary: Executes K Nearest Neighbors models on a data table with and without a By variable (a randomly generated Group column) and saves prediction formulas for K = 1, 2, and 3.
Code:
If( Contains( JMP Product Name(), "Pro" ),
dt = Open("data_table.jmp");
dt << New Column( "Group", formula( Random Integer( 1, 2 ) ) );
ncols1 = N Cols( dt );
obj1 = dt << K Nearest Neighbors( Y( :Species ), X( :Sepal length, :Sepal width, :Petal length, :Petal width ), K( 3 ), Nonrandom );
obj1 << (Response[1] << Save Prediction Formula( 1 ));
obj1 << (Response[1] << Save Prediction Formula( 2 ));
obj1 << (Response[1] << Save Prediction Formula( 3 ));
obj2 = dt << K Nearest Neighbors(
Y( :Species ),
X( :Sepal length, :Sepal width, :Petal length, :Petal width ),
By( :Group ),
K( 3 ),
Nonrandom
);
obj2[1] << (Response[1] << Save Prediction Formula( 1 ));
obj2[1] << (Response[1] << Save Prediction Formula( 2 ));
obj2[1] << (Response[1] << Save Prediction Formula( 3 ));
ncols2 = N Cols( dt );
Close( dt, no save );
);
Code Explanation:
- Check if JMP is Pro version.
- Open data table.
- Add new "Group" column.
- Count initial columns.
- Run K-Nearest Neighbors model.
- Save prediction formula for K = 1.
- Save prediction formula for K = 2.
- Save prediction formula for K = 3.
- Run K-Nearest Neighbors model by group.
- Save prediction formula for K = 1 from the first By group.
- Save prediction formula for K = 2 from the first By group.
- Save prediction formula for K = 3 from the first By group.
- Count updated columns.
- Close dataset without saving.
K Nearest Neighbors using Shape
Example 1
Summary: Runs the K Nearest Neighbors analysis by opening a data table, setting a random seed, extracting predictor variables, scaling data, creating a KDTable, finding nearest neighbors, and saving near neighbor rows and distances.
Code:
dt = Open("data_table.jmp");
r = 5384;
m = 6;
mdata = (dt << get as matrix)[0, 1 :: 4];
scl data = mdata :/ Shape( V Std( mdata ), N Rows( mdata ), N Cols( mdata ) );
{b knnrows, b dist} = KDTable( scl data ) << K nearest rows( Eval( m ) );
obj = dt << K Nearest Neighbors( X( :Sepal length, :Sepal width, :Petal length, :Petal width ), K( m ), Set Random Seed( r ) );
test1 = Try( Report( obj )[Outline Box( 1 )] << get title, "No Report" );
If( test1 == "K Nearest Neighbors",
obj << Save Near Neighbor Rows( 1 );
knnrows = (dt << get as matrix)[0, 5 :: 4 + m];
obj << Save Near Neighbor Distances( 1 );
dist = (dt << get as matrix)[0, 11 :: 16];
Close( dt, no save );
);
Code Explanation:
- Open data table.
- Store random seed value.
- Define number of neighbors.
- Extract predictor variables.
- Scale data.
- Create KDTable.
- Find nearest neighbors.
- Run K Nearest Neighbors analysis.
- Check if report exists.
- Save near neighbor rows and read them back as a matrix.
- Save near neighbor distances and read them back as a matrix.
- Close data table without saving.
Example 2
Summary: Runs K-Nearest Neighbors (KNN) analysis on a data table, utilizing standardization and KDTable creation to find the nearest rows.
Code:
dt = Open("data_table.jmp");
r = 5384;
m = 6;
mdata = (dt << get as matrix)[0, 1 :: 4];
scl data = mdata :/ Shape( V Std( mdata ), N Rows( mdata ), N Cols( mdata ) );
{b knnrows, b dist} = KDTable( scl data ) << K nearest rows( Eval( m ) );
obj = dt << K Nearest Neighbors( X( :Sepal length, :Sepal width, :Petal length, :Petal width ), K( m ), Set Random Seed( r ) );
test1 = Try( Report( obj )[Outline Box( 1 )] << get title, "No Report" );
Code Explanation:
- Open data table.
- Store random seed value.
- Define number of neighbors.
- Extract feature columns.
- Standardize feature data.
- Create KDTable.
- Find K nearest rows.
- Perform KNN analysis.
- Retrieve report outline.
- Get title or default message.
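The scaling step shared by the examples in this section can be sketched on a small literal matrix. The values and the names x, sd, sdFull, and scaled are illustrative assumptions; the calls themselves (V Std, Shape, KDTable, K nearest rows) are the ones the examples use.

```jsl
// Illustrative 4 x 2 data matrix (made-up values).
x = [1 10, 2 20, 3 35, 4 40];

sd = V Std( x );                                 // row vector of column standard deviations
sdFull = Shape( sd, N Rows( x ), N Cols( x ) );  // sd repeated row by row to match x
scaled = x :/ sdFull;                            // each column divided by its own sd

// After scaling, every column should have standard deviation 1.
Show( V Std( scaled ) );

// For each row, find its 2 nearest rows and the distances to them.
{rows, dist} = KDTable( scaled ) << K nearest rows( 2 );
Show( rows, dist );
```

Note that this divides by the standard deviation without subtracting the mean, which matches the examples; distances between rows are unaffected by centering.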
K Nearest Neighbors using New Column
Summary: Runs a K Nearest Neighbors analysis on four predictor columns, builds a KDTable on the same columns scaled by their standard deviations, and saves near neighbor rows and distances.
Code:
dt = Open("data_table.jmp");
dt << New Column( "Species_Num", formula( Match( :Species, "setosa", 1, "versicolor", 2, "virginica", 3 ) ) );
act1 = dt:Species_Num << get values;
m = 6;
mdata = (dt << get as matrix)[0, 1 :: 4];
scl data = mdata :/ Shape( V Std( mdata ), N Rows( mdata ), N Cols( mdata ) );
{b knnrows, b dist} = KDTable( scl data ) << K nearest rows( Eval( m ) );
obj = dt << K Nearest Neighbors( X( :Sepal length, :Sepal width, :Petal length, :Petal width ), K( m ) );
rpt = Report( obj );
test1 = Try( Report( obj )[Outline Box( 1 )] << get title, "No Report" );
If( test1 == "K Nearest Neighbors",
obj << Save Near Neighbor Rows( 1 );
knnrows = (dt << get as matrix)[0, 6 :: 5 + m];
obj << Save Near Neighbor Distances( 1 );
dist = (dt << get as matrix)[0, 12 :: 17];
);
Code Explanation:
- Open data table.
- Create Species_Num column.
- Retrieve Species_Num values.
- Set K value to 6.
- Extract first four columns as matrix.
- Standardize the data matrix.
- Build KDTable for standardized data and find K nearest rows.
- Perform K-Nearest Neighbors analysis.
- Generate report from analysis.
- Check if report exists.
- Save near neighbor rows if report valid.
- Extract saved near neighbor rows.
- Save near neighbor distances if report valid.
- Extract saved distances.