Author: saqibkhan
-
What are correlation and covariance, and how do you calculate them in R?
Correlation is a measure of the strength and direction of the linear relationship between two variables. It takes values from -1 (a perfect negative correlation) to 1 (a perfect positive correlation). Covariance is a measure of how two variables change relative to each other and the direction of the linear relationship between…
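As a minimal sketch of the calculation, base R's cov() and cor() can be applied to two numeric vectors. The vectors x and y below are made-up illustration data, not taken from the original post. The last line shows the link between the two measures: Pearson correlation is covariance rescaled by the two standard deviations.

```r
# Hypothetical example data: y is exactly 2 * x, so the relationship is perfectly linear
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)

# Sample covariance (divides by n - 1)
cov_xy <- cov(x, y)

# Pearson correlation (the default method of cor())
cor_xy <- cor(x, y)

# Correlation is covariance divided by the product of the standard deviations
cor_manual <- cov_xy / (sd(x) * sd(y))
```

Here cov_xy is 5 and cor_xy is 1, since the points lie exactly on an increasing line. Unlike correlation, the covariance value depends on the units of x and y.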
-
How to select features for machine learning in R?
Let’s consider three different approaches and how to implement them in the caret package. We need to create a correlation matrix of all the features and then identify the highly correlated ones, usually those with a correlation coefficient greater than 0.75: We need to create a training scheme to control the parameters for train, use…
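The correlation-matrix step described above can be sketched in base R alone, so that it runs without extra packages. This is an illustration under assumptions, not the post's own code: the built-in mtcars dataset and the chosen columns are stand-ins, and the 0.75 cutoff is applied to the absolute value of the coefficient. The caret package provides findCorrelation() for a similar filter.

```r
# Stand-in feature set: a few numeric columns from the built-in mtcars data
features <- mtcars[, c("mpg", "disp", "hp", "wt", "qsec")]

# Pairwise Pearson correlations between all features
cor_matrix <- cor(features)

# Keep only the upper triangle so each pair is reported once,
# and flag pairs whose absolute correlation exceeds the cutoff
cutoff <- 0.75
high <- which(abs(cor_matrix) > cutoff & upper.tri(cor_matrix), arr.ind = TRUE)

# Tabulate the highly correlated pairs
pairs <- data.frame(
  feature_1   = rownames(cor_matrix)[high[, "row"]],
  feature_2   = colnames(cor_matrix)[high[, "col"]],
  correlation = cor_matrix[high]
)
print(pairs)
```

From each flagged pair, one feature is a candidate for removal before training.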
-
What are regular expressions, and how do you work with them in R?
A regular expression, or regex, in R or other programming languages, is a sequence of one or more characters that describes a text pattern and is used for searching and mining text data. In R, there are two main ways of working with regular expressions:
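For illustration, here is a short sketch using base R's regex functions (grepl(), sub(), gregexpr() with regmatches()). The sample strings and the simplified email pattern are assumptions made for the example; the pattern is not a complete email validator.

```r
# Hypothetical input strings
emails <- c("alice@example.com", "not an email", "bob@test.org")

# grepl() returns TRUE/FALSE for each element that matches the pattern
email_pattern <- "^[[:alnum:]._-]+@[[:alnum:].-]+\\.[a-z]+$"
is_email <- grepl(email_pattern, emails)

# sub() replaces the first match; here it strips everything up to the "@"
domains <- sub("^.*@", "", emails[is_email])

# gregexpr() + regmatches() extract every match from a string
text <- "Order 66 shipped in 3 days"
digits <- regmatches(text, gregexpr("[0-9]+", text))[[1]]
```

With this input, is_email is TRUE, FALSE, TRUE, domains holds "example.com" and "test.org", and digits holds "66" and "3".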
-
List and define the control statements in R.
There are three groups of control statements in R: conditional statements, loop statements, and jump statements. Conditional statements: Loop statements: Jump statements:
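The three groups can be illustrated with a small sketch. The statements shown (if/else as a conditional, for, while, and repeat as loops, next and break as jumps) are standard R constructs chosen for the example, since the excerpt above does not list the members of each group. The function name classify is hypothetical.

```r
# Conditional statement: if / else
classify <- function(n) {
  if (n %% 2 == 0) "even" else "odd"
}

# Loop statement (for) combined with jump statements (next, break)
total <- 0
for (i in 1:10) {
  if (i %% 2 == 0) next   # skip even numbers
  if (i > 7) break        # stop the loop once i exceeds 7
  total <- total + i      # adds 1, 3, 5, 7
}

# Loop statement: while runs as long as its condition holds
n <- 3
while (n > 0) {
  n <- n - 1
}

# Loop statement: repeat runs until an explicit break
count <- 0
repeat {
  count <- count + 1
  if (count >= 3) break
}
```

After running this, total is 16, n is 0, and count is 3.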
-
What is the difference between the functions apply(), lapply(), sapply(), and tapply()?
While all these functions allow iterating over a data structure without using loops and perform the same operation on each element of it, they are different in terms of the type of input and output and the function they perform.
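A brief sketch of the input and output differences, using small made-up data: apply() works over the rows or columns of a matrix, lapply() returns a list, sapply() simplifies the result to a vector where possible, and tapply() applies a function to groups defined by a factor-like vector.

```r
# apply(): operate over rows (1) or columns (2) of a matrix
m <- matrix(1:6, nrow = 2)          # rows are (1, 3, 5) and (2, 4, 6)
row_sums <- apply(m, 1, sum)        # one sum per row

# lapply(): always returns a list, one element per input element
lst <- list(a = 1:3, b = 4:6)
l_out <- lapply(lst, mean)

# sapply(): like lapply(), but simplifies to a vector when it can
s_out <- sapply(lst, mean)

# tapply(): apply a function to a vector split into groups
values <- c(10, 20, 30, 40)
groups <- c("x", "y", "x", "y")
t_out <- tapply(values, groups, sum)
```

Here row_sums is 9 and 12, l_out is a list with a = 2 and b = 5, s_out is the same result as a named numeric vector, and t_out gives 40 for group "x" and 60 for group "y".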
-
What is the use of the switch() function in R?
The switch() function in R is a multiway branch control statement that evaluates an expression against items of a list. It has the following syntax: The expression passed to the switch() function can evaluate to either a number or a character string, and depending on this, the function behavior is different. 1. If the expression evaluates to a number,…
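The two behaviors can be sketched as follows. The values and the helper name describe are hypothetical and chosen only for illustration. With a number, switch() selects the alternative at that position; with a character string, it selects the alternative with the matching name, and an unnamed final alternative acts as the default.

```r
# Numeric expression: selects the alternative by position
by_position <- switch(2, "first", "second", "third")

# A number outside the range of alternatives yields NULL (invisibly)
out_of_range <- switch(4, "first", "second", "third")

# Character expression: selects the alternative by name;
# the unnamed last alternative is the fallback
describe <- function(op) {
  switch(op,
         add = "addition",
         sub = "subtraction",
         "unknown operation")
}
```

Here by_position is "second", out_of_range is NULL, describe("add") returns "addition", and describe("mul") falls through to "unknown operation".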
-
How to parse a date from its string representation in R?
To parse a date from its string representation in R, we can use the lubridate package of the tidyverse collection. This package offers various functions for parsing a string and extracting the standard date from it based on the initial date pattern in that string. These functions are ymd(), ymd_hm(), ymd_hms(), dmy(), dmy_hm(), dmy_hms(), mdy(), mdy_hm(), mdy_hms(), etc., where y, m, d, h, m, and s…
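A minimal sketch of date parsing follows. The date strings are made up for the example. The lubridate calls are wrapped in a requireNamespace() check as a precaution, so the snippet still runs when the package is not installed; base R's as.Date() with an explicit format string is shown alongside for comparison.

```r
# Base R: ISO-style strings parse without a format argument
d1 <- as.Date("2023-04-15")

# Base R: other layouts need an explicit format string
d2 <- as.Date("15/04/2023", format = "%d/%m/%Y")

# lubridate: the function name encodes the order of the components
if (requireNamespace("lubridate", quietly = TRUE)) {
  d3 <- lubridate::dmy("15/04/2023")                 # day, month, year
  d4 <- lubridate::ymd_hms("2023-04-15 13:45:30")    # date plus time
}
```

Both d1 and d2 represent 15 April 2023, and d3 (when lubridate is available) is the same date.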
-
How to create a new column in a data frame in R based on other columns?
1. Using the transform() and ifelse() functions of base R.
2. Using the with() and ifelse() functions of base R.
3. Using the apply() function of base R.
4. Using the mutate() function of the dplyr package and the ifelse() function of base R.
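The four approaches can be sketched on a small example. The data frame df, its columns, and the pass mark of 50 are hypothetical illustration values, not the original post's data. The dplyr call is guarded with requireNamespace() so the snippet still runs if the package is not installed.

```r
# Hypothetical data: derive a "result" column from "score"
df <- data.frame(name = c("a", "b", "c"), score = c(45, 70, 90))

# 1. transform() + ifelse()
df1 <- transform(df, result = ifelse(score >= 50, "pass", "fail"))

# 2. with() + ifelse()
df2 <- df
df2$result <- with(df2, ifelse(score >= 50, "pass", "fail"))

# 3. apply() over rows; each row arrives as a character vector,
#    so the score is converted back to numeric before comparing
df3 <- df
df3$result <- apply(df3, 1, function(row) {
  if (as.numeric(row[["score"]]) >= 50) "pass" else "fail"
})

# 4. dplyr::mutate() + ifelse(), if dplyr is available
if (requireNamespace("dplyr", quietly = TRUE)) {
  df4 <- dplyr::mutate(df, result = ifelse(score >= 50, "pass", "fail"))
}
```

Each variant yields the values "fail", "pass", "pass" in the new column. The vectorized ifelse() forms are usually preferable to the row-wise apply() form, which converts the data frame to a matrix first.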
-
What is the difference between the subset() and sample() functions in R?
The subset() function in R is used for extracting rows and columns from a data frame or a matrix, or elements from a vector, based on certain conditions, e.g.: subset(my_vector, my_vector > 10). By contrast, the sample() function in R works on vectors. It extracts a random sample of the predefined size from the elements of a vector,…
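The contrast can be sketched with a short example. The vector my_vector and the seed value are made up for illustration, and mtcars is R's built-in dataset. subset() is deterministic and condition-based, while sample() draws at random, so set.seed() is used to make the draw reproducible.

```r
# Hypothetical vector
my_vector <- c(5, 12, 8, 20, 15)

# subset(): keep the elements that satisfy a condition
above_ten <- subset(my_vector, my_vector > 10)

# subset() on a data frame: filter rows and pick columns
big_cars <- subset(mtcars, cyl == 8, select = c(mpg, hp))

# sample(): draw 3 elements at random, without replacement by default
set.seed(42)  # fix the random number generator for reproducibility
picked <- sample(my_vector, size = 3)
```

Here above_ten is always 12, 20, 15, whereas picked contains three distinct elements of my_vector that depend on the seed.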