Thursday 4 July 2013


Apache Hadoop
Doug Cutting, Chief Architect, Cloudera

Big data's time has come

graph of Moore's Law

the same has held true for other aspects of IT (size, speed, cost etc): exponential improvement

how we process data will change too - once industries can harness data we will see big changes there too.

some people love, others hate, the term 'big data'. Who cares?

what is big data?
  • it's scalable - distributed, commoditized, reliable computing
trends -

open source's time has come
open software - most based on Linux
at Apache (NFP, 140 projects), quality is emergent

  • is flexible - with relational databases you spend a lot of time upfront deciding what the columns are going to be, which presupposes what the Qs are going to be. The system should instead accept raw data and project schemas onto it. This general notion of not pre-conceiving the data (schema on read) is the difference between big data and traditional relational data. Data is hard to move, so as much as possible you want to operate on data where it lives.
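To make the schema-on-read idea concrete, here is a minimal sketch in Python. It is not from the talk: the sample records, field names and the `project` helper are all made up for illustration. The point is only that the raw data is stored untouched and a schema is chosen at the moment a question is asked.

```python
import json

# Hypothetical raw events, stored exactly as they arrived (no table design upfront).
RAW_LINES = [
    '{"user": "ann", "action": "view", "page": "/home"}',
    '{"user": "bob", "action": "buy", "item": "book", "price": 12.5}',
    '{"user": "ann", "action": "buy", "item": "pen", "price": 1.2}',
]


def project(raw_lines, fields):
    """Apply a schema at read time: keep the named fields, None where absent."""
    for line in raw_lines:
        record = json.loads(line)
        yield {name: record.get(name) for name in fields}


# Two different questions project two different schemas onto the same raw data.
purchases = [
    row for row in project(RAW_LINES, ["user", "item", "price"])
    if row["item"] is not None
]
page_views = [
    row for row in project(RAW_LINES, ["user", "page"])
    if row["page"] is not None
]
```

With a relational table the columns would have to be fixed before the first record was loaded; here a new question only needs a new list of fields.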
Apache Hadoop started as a batch system - he started out trying to build an index and search engine similar to what Bing does. Google published some papers on what they were doing, and he decided that was a better way of doing things, so he implemented that. Hadoop is a massive batch processing system. if you want to do some processing
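The notes don't name the batch model, but the Google papers behind Hadoop include the MapReduce paper, so here is a rough single-machine sketch of that style of processing in Python. It is an illustration only, not Hadoop's API: the function names and sample documents are invented, and a real cluster would spread the map and reduce steps across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values seen for one key."""
    return key, sum(values)


def word_count(documents):
    """Run the whole batch: map every document, group by word, then reduce."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical input documents.
counts = word_count(["big data", "big batch jobs", "data beats algorithms"])
```

Each map call only looks at one document and each reduce call only looks at one key, which is what lets this kind of job be split across a large cluster.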

Tools have since been added to the Hadoop ecosystem, eg
  • Apache Pig - data flow language
  • Hive - SQL engine
  • Mahout - library of machine learning programs
  • HBase - after a paper from Google - lets you store values under keys in effectively one huge table, with incredible rates of insert and lookup. A NoSQL store
  • Cloudera Impala - SQL engine - executes SQL queries interactively
  • Solr - search technology with distributed scalable search.
Hadoop becoming the big data operating system

User
Application
Operating System
Hardware

Big data fuels innovation

"more data beats better algorithms"

"need a platform you can shove your data into and have access to a rich set of tools to enable you to explore your data."

big data - spreadsheets on steroids

recommended video
Peter Norvig - The Unreasonable Effectiveness of Data

+---
Plenary: Playing Nicely in the Sandbox: Ed Tech, Education Researchers, & Business
Speakers: Alan Louie (Imagine K12 - incubator for startups who want to build for K12), Steve Schoettler (startup Junyo), Nicole Forsgren Velasquez

Where is the win-win?

Alan
Teams get 20K and have to spend 2 months in the Stanford area; 10 teams per 6 month period. Took the knowledge we have about startups and looked at how it can be applied to K12.

Needs to be more collaboration between researchers and startups.

Edtech startups pick fast and cheap - exactly opposite for academics - go for good.

some people keep researchers as far away from startups as possible.

Nicole
Initially looked at as though she was an alien because she came from business

accept that people are going to monetise things. Edtech startups have data - so contact someone from a Faculty of Business.

Steve
sucker for hard problems.

has found that working with others requires a multidisciplinary approach. The startup owner's manual says start with cheap and scratchy and iterate to get it working. Lots of startups are missing key elements.

often see great ideas but they haven't thought of things like, how to get a teacher to use this? Need for scalable, reliable technology.

have to join and bring together all disciplines to address these end to end problems.


mention of zynga.com - free online, social games

+---

Panel: Analytics for 21st Century Skills
Panelists: Rebecca Ferguson, Ruth Deakin Crick, Peter Foltz

what analytics literacies do people need for 21st century?
how can analytic tools/ technologies help people become literate?

literacy as
  • reading and writing
  • reading the environment - understanding the joint result of complexity among supporting elements (people, technology, context)
  • writing the environment - contributing, changing, augmenting, designing, modding, evaluating
new literacies of eLearning
  1. multimodal - communicate across multiple platforms eg storify, pinterest
  2. multiactor
  3. sociotechnical
  4. collaboration
  5. emergence - comfortable with continuously evolving environment
Peter
Data analytics for 21st century skills: focus on writing, speaking & communicating

  • higher level thinking skills becoming increasingly important in the workplace
  • Common Core, PISA, ATC21S, DEAG
  • general competencies - cognitive, problem solving etc, leadership
assessment of the free expression of knowledge and skills

to demonstrate, learners must be able to process and generate it independently
think, talk and write using effective skills

MCQ doesn't cut it, but hand scoring of written and spoken tests is not scalable

how can technology meet this challenge:
  • convert written and spoken performance into measures of skills and abilities
  • need engaging and realistic items that train and test people with the context and content for the workplace
  • reliable, valid, efficient, cost effective
  • help personalisation, realtime feedback, decision-making by teachers, administrators
therefore Pearson doing:

Automated language analysis
complexity of language can be distilled by mathematical methods - use computational linguistics, NLP, machine learning, automated speech recognition

eg writing
  • summative K-12 writing
  • formative writing practice with feedback
  • situation judgment tasks - what would you do?
  • tasks and simulations - write an email to your boss etc
speaking assessment
  • english/ foreign language proficiency
  • reading fluency
  • create a sentence with these three words
  • describe what you see in a video
team communication
  • predict team performance measure and warn instructors
  • automated monitors for students and teachers in learning chat rooms
Comments on PISA2015

how do we incorporate new kinds of thinking?

how do we emphasise students working together?

Q
Intelligent essay assessor - any progress beyond word count? Yes

Ruth
21C Skills: EnquiryBlogger

everybody has a little list

European Commission list

Learning to learn: perspective from Theory and Practice, Routledge

competence is more than one thing - includes identity, learning power, knowledge, skills & understanding = competence in the world   <-- used in EnquiryBlogger

See  in journal
Crick & Claxton (2004) Developing an Effective Lifelong Learning Inventory: the ELLI Project, Assessment in Education.

See also assessment article

Rebecca
EnquiryBlogger: reflection and relationships

EnquiryBlogger site

showed example of a real blog - built on a standard collaborative WordPress basis. Couldn't help but notice the spelling

ELLI Spider: learning power




structuring knowledge


clicking on a blob will take you to that blog post

Mood view: managing mood

learningemergence.net/tools/enquiryblogger
Wordpress plugin



