
SJSU The Property of An Object and The Data Attribute Discussion

 

Answer 1

An attribute is a property or characteristic of a data object, and it is a basic concept in data mining. A set of attributes describes each object, so a data set can be viewed as a collection of records in which every instance is described by the same attributes. Attribute types include nominal, ordinal, binary, and numeric attributes, and numeric attributes are further divided into interval-scaled and ratio-scaled attributes. Hyperparameters are another important concept in statistics and data mining. In statistics, a hyperparameter is a parameter of a prior distribution, which captures the prior belief before new data is observed. In a data mining algorithm, hyperparameters must be initialized before a new model is trained (Prabhu, 2018). A decision tree is also a useful data mining structure. It consists of a single root node, several branches, and a number of leaf nodes. Each internal node denotes a test on an attribute, each branch indicates an outcome of the test, and each leaf node holds a class label.

REFERENCES

GeeksforGeeks. (2020). Data Mining: Data Attributes and Quality. Retrieved from https://www.geeksforgeeks.org/data-mining-data-attributes-and-quality/

Prabhu. (2018). Understanding Hyperparameters and their Optimisation techniques. Retrieved from https://towardsdatascience.com/understanding-hyper…

—————————————————————————————————————————————-

Answer 2

Discussion#2

Chapter#2: Q1: An attribute is a property or characteristic of an object that can vary, either from one object to another or from one time to another. An attribute is also called a field, characteristic, variable, dimension, or feature (Tan et al., 2019).

A data set can often be viewed as a collection of data objects. These data objects are described by a number of attributes, for example the mass of a physical object or the time at which an event occurred. The group of attributes used to describe a specific object is referred to as its attribute vector or feature vector.
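As a minimal sketch of the idea of a feature vector, the Python snippet below describes each data object by a fixed group of attributes. The `Student` class and its attribute names are hypothetical and were chosen only for illustration.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class Student:
    # Hypothetical attributes, chosen only for illustration.
    eye_color: str    # a category with no order
    grade: str        # a category with an order (A is better than B)
    height_cm: float  # a measured quantity


# A data set is a collection of data objects.
students = [
    Student("brown", "B", 172.0),
    Student("blue", "A", 165.5),
]

# The attribute values of one object, taken in a fixed order,
# form that object's feature vector.
feature_vectors = [astuple(s) for s in students]
```

Each entry of `feature_vectors` is a plain tuple, so every object in the data set is described by the same attributes in the same positions.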

Chapter#2: Q2: According to the GeeksforGeeks (2021) article, the attribute types are as follows:

Qualitative Attributes:

Nominal Attributes: The values of a nominal attribute are names of things or symbols of some kind. They represent a category or state, which is why nominal attributes are also known as categorical attributes, and there is no order among the values.

Binary Attributes: A binary attribute takes only two values, such as Yes/No or True/False.

Ordinal Attributes: Ordinal attributes have values with a meaningful order or ranking, but the size of the difference between successive values is unknown. The order shows which value ranks higher but not by how much.
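The practical difference between these qualitative types is which operations make sense on their values. The sketch below uses hypothetical example values (none come from the cited article): counting works for nominal values, while sorting needs an explicit ranking that only ordinal values have.

```python
from collections import Counter

# Hypothetical example values, used only for illustration.
colors = ["red", "green", "red"]      # nominal: names with no order
passed = [True, False, True]          # binary: exactly two possible values
sizes = ["small", "large", "medium"]  # ordinal: names with a ranking

# Nominal: equality and counting are meaningful, so the mode is well defined.
mode_color = Counter(colors).most_common(1)[0][0]

# Ordinal: order is meaningful, so the ranking is made explicit.
# The ranks say which size comes first, not how far apart the sizes are.
SIZE_RANK = {"small": 0, "medium": 1, "large": 2}
sorted_sizes = sorted(sizes, key=SIZE_RANK.__getitem__)
largest = max(sizes, key=SIZE_RANK.__getitem__)

# Binary: the set of distinct values never exceeds two.
distinct_binary_values = set(passed)
```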

Quantitative Attributes:

Numeric: A numeric attribute is a measurable quantity that may be expressed as an integer or a real value. There are two kinds of numeric attributes: interval-scaled and ratio-scaled.

An interval-scaled attribute has values with meaningful differences, but it lacks a true point of reference, also known as a zero point.

A ratio-scaled attribute is a numeric attribute with a fixed zero point. If a measurement is ratio-scaled, we can say that one value is a multiple of another.

Discrete: A discrete attribute has a finite or countably infinite set of possible values, which may be numerical or categorical in nature.

Continuous: A continuous attribute can take an unlimited number of values and is typically represented as a floating-point number. For example, there are infinitely many possible values between 2 and 3.
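The distinction between interval-scaled and ratio-scaled attributes can be illustrated with temperature. This is a sketch under the assumption that Celsius serves as the interval-scaled example and Kelvin as the ratio-scaled one; the variable names are hypothetical.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert a Celsius temperature to Kelvin, which has a true zero point."""
    return celsius + 273.15


a_c, b_c = 20.0, 10.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences are meaningful on both scales and agree with each other.
diff_c = a_c - b_c
diff_k = a_k - b_k

# Ratios are only meaningful on the ratio scale.
ratio_c = a_c / b_c  # 2.0, but 20 degrees C is not "twice as hot" as 10 degrees C
ratio_k = a_k / b_k  # about 1.035, the physically meaningful ratio
```

Because the Celsius zero is an arbitrary reference point, `ratio_c` changes if the scale is shifted, while the differences stay the same.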

Chapter#3: Q1: The task of assigning labels to unlabeled data instances is known as classification, and a classifier is used to perform this task. A classifier is generally described in terms of a model. The model is built using a collection of instances called the training set, which includes the attribute values as well as the class label of each instance. A learning algorithm is a systematic approach for learning a classification model from a training set. Induction is the process of applying a learning algorithm to build a classification model from training data. This process is also known as “learning a model” or “building a model.” The process of applying a classification model to unseen test instances to predict their class labels is called deduction. Therefore, the classification process includes two steps: applying the learning algorithm to the training data to learn the model, and then applying the model to assign labels to unlabeled instances (Tan et al., 2019).
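The two steps can be sketched in Python as follows. This is a deliberately simple, hypothetical classifier (a majority-label lookup on a single attribute), not a method taken from Tan et al.; the function names `induce` and `deduce` and the sample data are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def induce(training_set):
    """Induction: learn a model from (attribute_value, class_label) pairs."""
    labels_by_value = defaultdict(Counter)
    overall = Counter()
    for value, label in training_set:
        labels_by_value[value][label] += 1
        overall[label] += 1
    # The model maps each attribute value to its most frequent class label
    # and falls back to the overall majority label for values never seen.
    rules = {v: c.most_common(1)[0][0] for v, c in labels_by_value.items()}
    default = overall.most_common(1)[0][0]
    return rules, default


def deduce(model, value):
    """Deduction: apply the learned model to an unlabeled instance."""
    rules, default = model
    return rules.get(value, default)


# Hypothetical training set: body covering -> class label.
training_set = [
    ("fur", "mammal"), ("fur", "mammal"), ("fur", "mammal"),
    ("feathers", "bird"), ("scales", "reptile"),
]

model = induce(training_set)
predictions = [deduce(model, v) for v in ["feathers", "fur", "shell"]]
```

The value `"shell"` never appears in the training set, so the model falls back to the majority label of the training data.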

Chapter#3: Q2: A hyperparameter is a parameter whose value is fixed before learning begins and that must be tuned to improve the model. Tuning it by looking at the test data introduces bias, so such parameters should be tuned on the training set (Glen, 2018). Hyper-parameter selection is the process of choosing the values of the hyper-parameters. It takes place during model selection and must be taken into account during model evaluation.
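A minimal sketch of hyper-parameter selection is shown below. It assumes a k-nearest-neighbor classifier on a single numeric attribute, with k as the hyper-parameter, and it assumes that part of the labeled training data is held out as a validation set. The data, the candidate values, and the function names are all hypothetical.

```python
from collections import Counter


def knn_predict(train, x, k):
    """Predict the label of x by majority vote among its k nearest neighbors."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train, data, k):
    """Fraction of instances in data that are labeled correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in data)
    return hits / len(data)


# Hypothetical labeled data, split into a training part and a validation part.
# The point (3.4, "b") acts as noise inside the "a" region.
train = [(1.0, "a"), (2.0, "a"), (3.0, "a"), (3.4, "b"),
         (6.0, "b"), (7.0, "b"), (8.0, "b")]
validation = [(1.5, "a"), (2.5, "a"), (3.3, "a"), (6.5, "b"), (7.5, "b")]

# k is chosen before the final model is used, without touching any test data.
candidate_ks = [1, 3, 5]
best_k = max(candidate_ks, key=lambda k: accuracy(train, validation, k))
```

With k = 1 the noisy point misleads the prediction for 3.3, so a larger k scores better on the validation set. The test set plays no part in this choice, which keeps the later evaluation unbiased.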

Chapter#3: Q3: According to Tan et al. (2019), a decision tree is a hierarchical structure that organizes a series of questions and their possible answers. The nodes of a tree are of three types:

A root node has no incoming links and zero or more outgoing links.

An internal node has exactly one incoming link and two or more outgoing links.

A leaf or terminal node has exactly one incoming link and no outgoing links.
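The three node types can be represented with one small data structure. The sketch below is an assumption about how such a tree might be coded, not an implementation from the textbook; the `Node` class, the attribute names, and the example tree are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    # Root and internal nodes test an attribute; leaf nodes carry a class label.
    attribute: Optional[str] = None
    children: Dict[str, "Node"] = field(default_factory=dict)  # outcome -> child
    label: Optional[str] = None

    def is_leaf(self) -> bool:
        # A leaf has no outgoing links.
        return not self.children


def classify(node: Node, record: Dict[str, str]) -> Optional[str]:
    """Follow the branch matching each test outcome until a leaf is reached."""
    while not node.is_leaf():
        node = node.children[record[node.attribute]]
    return node.label


# Hypothetical tree: the root tests body temperature, one internal node
# tests whether the animal gives birth, and three leaves hold class labels.
tree = Node("body_temperature", {
    "cold": Node(label="non-mammal"),
    "warm": Node("gives_birth", {
        "yes": Node(label="mammal"),
        "no": Node(label="non-mammal"),
    }),
})

result = classify(tree, {"body_temperature": "warm", "gives_birth": "yes"})
```

Here the root has no incoming link, the `gives_birth` node has one incoming link and two outgoing links, and every leaf has one incoming link and none going out.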

References

Tan, P.N., Steinbach, M., Karpatne, A., & Kumar, V. (2019). Introduction to Data Mining. 2nd Edition. Pearson.

GeeksforGeeks. (2021, January 19). Understanding Data Attribute Types | Qualitative and Quantitative. https://www.geeksforgeeks.org/understanding-data-a…

Glen, S. (2018, March 15). Hyperparameter: Simple Definition. Statistics How To. https://www.statisticshowto.com/hyperparameter/