Excel Notes

Where does excel sits in Data Science?

When a transaction is made, data is getting created.

Then that data is saved in a database in a table form.

Then a portion of data is extracted from the database by someone sitting in an office for analysis purpose.

The extracted file is also known as flat file in format such as csv, txt, xlsx, etc.

Excel is used to open those files and see the structure of raw data. Excel can also be used for quick analysis.

different spreadsheet tools:

Excel, WPS, Libre Calc, Google Sheet, (Open Office excel), SAP, Canvas.

Most important thing we should not is, to what every single rows corresponds.

Means why each rows are created separately.

To what each rows corresponds?

What pairs of umpires went for umpiring most?

*********************

Custom sorting in Excel is used to sort in combination with multiple order such as first sort match id, then innings, then over, then balls.

Otherwise, it will sorting all balls together without considering order of over,inning or match. It will mix everything together to show only sorting of balls.

**************

= – equal to symbol show for calculated field in tableau.

****************

sumif – if range is equal to criteria, then take sum of the matching cells.

=sumif(range,criteria,[sum of range])

**************

countif

**********

maxifs – it will take maximum value from the range where the criteria is matching.

s is added in the formula for multiple criteria.

************

Number of wickets lost in first 5 overs for MI and DC

=sumifs(

practice sumifs

***********

check sumif vs countif for wicket total for a team.

***********************

********************

Concat formula

it helps to add value in the cells easily just by comma.

It does not require &. Instead of &, we can use comma.

If(condition, true, false)

Nested If –

IF(condition,true,if(condition,true,false))

**********

conditional concate using if function

***********

=find(“”,c2) – it will give position of space

=Left(C2,Fine(“”,C2) – It will help us to extract word from left side where the space is coming

such as Rovin sharma

we can get rovin from left till where the space is coming.

**************

vlookup look to the right side, not left side.

************

Xlookup

xlookup(lookup value, lookup array, return array)

It returns multiple cells in the range mentioned.

It also return values from the left side of the main value. unlike vookup which only show from right side of the value.

***********

To get unique list from a range

=unique(A2:A100)

We will get whatever unique name or value is there as a list.

If you selected entire column range – A:A, then it will also return 0, if there are empty cells in the range.

*********

Find the top 5 winning venues for MI

=large(array,2)

array – range

2 – 2nd highest value

Large formula is like Rank

*******************

=filter(array,include,)

array – put range from where you want the value or list.

include – B4:B20>=5

It is the condition.

Then it will show the list which is more than or equal to 5.

**********

Excel 3rd Class

Loyalytics Tableau Case study- 

Determine what percentage of registered customers had a customer ID but no nationality.

************

I want to see match id with result margin more than 50.

=filter(match id column, condition of more than 5 in run column)

**************

AND – multiplication by 1 is also AND referred by “*”.

true x true = true

1*1=1

true x false = false

1*0=0

false x false = false

0*0=0

**************

= sort formula

ascending order

descending order

************

Find winning % of who batted first for each venue

******************

Find a way to highlight max and min bar.

ask or check match formula

VLOOKUP

XLOOKUP

FILTER

COUNTIF

PIVOT

AVERAGE

SUM

LARGE

**************

Insert Table to convert into table format

We can use table format while creating pivot table, so that we can easily get new row data or column data in the pivot table by just refreshing the pivot.

It will help to not lose any value in newly added row or column.

***********************

Stats formula

example data:

4

1

2

8

5

2

1. Avg/Mean – total / count

2. median – first arrange in ascending order 

i.e. – 1,2,2,4,5,8

middle number – 2 and 4

2+4 divided by 2 = 3 – median

3. Quartile –

1,2,2,4,5,8,10,11,12

Quartile 3rd – also media

quartile 0 – 1

quartile 1 – 2

quartile 2 – 5 – 50% data before and after this point

quartile 3 – 10 – 75% data below and 25% above this point

quartile 4 – 12

********************

Q3 minus Q1 – IQR – Inter Quartile range

between these range, there are 50% data.

from 25% to 75% range.

**************

Mean vs Median

4,0,4,0,4,0

avg – 2

median – 4

median shows close to realistic trend.

mean may show high average, if just one value is very high. But average is not related to others.

******************

Standard Deviation

How much the value is away from the value.

=stddev(range)

**************

Variance

=var(range)

variance is square of standard deviation

Standard deviation and variance are connected.

**********

Calculative, Precise, Methodical

********************

MODE

Mode – most occuring value (mostly repeated value)

**********

Min – minimum value

Max – maximum value

************

What is the 25% of winning run – 25th percentage 

There are 5 types of quartile.

Quartile – 

=quartile(array,quart)

=quartile(range,1)

quartile include and exclude is there

***************

Percentile formula

**********

Learn sumproduct formula

************

MACRO – Automating repeated action

*******

in Excel for Windows operating system you might need to add developer tab. To add developer tab, click File, click Options, click Customize Ribbon, and check developer tab. You will find Macros button on Developer tab. I’m writing this on description below as well

*****

Assigning macro to shape

it will work like macro button

*********

Also, the macro will only work on the range which was selected while recording. It will miss other ranges out of the mentioned range.

*********************************

Slicer –

Slicer is a type of filter where we can extract a portion of table based on the single column filter.

It cannot be applied on value which is not formatted as table.

It is like filter on popup window.

*****************

round formula

floor formula to give the lowest value before decimal and removing decimal like 2.5 to 2

ceil formula – give highest round off value, like 2.5 to 3

=floor(cellname,6)

6 is closest multiple here.

*******

mod gives remainder when divided

=mod(legal,6) – what is 6 here ?

remainder after dividing by 6

*********************

******************

Leave a Reply

Your email address will not be published. Required fields are marked *

Back To Top