Programació del Seminari: Any 2010



CONVIDAT: Francisco Saldanha-da-Gama, Department of Statistics and Operations Research/Operations Research Center, University of Lisbon, Portugal.

IDIOMA: Anglès

LLOC: Edifici C5, Aula: C5016, Campus Nord, UPC (veure mapa)

DATA: Divendres, 17 de desembre de 2010, Hora: 12:30

RESUM: A project scheduling problem is a problem defined by a set of activities, the precedence relations between them and the corresponding execution times. The goal is to define the optimal schedule for the activities so that some performance measure associated with the project is optimized. The makespan is the performance measure most used in practice. Often, specific resources that exist in a limited amount are required to execute the activities. In such case, each activity requires a specific amount of each type of resource. This extension of the basic problem is called the Resource-Constrained Project Scheduling Problem (RCPSP).

One major assumption in the RCPSP is that each resource has some specific function. However, in many practical situations (e.g. when human resources are involved), a resource can perform several functions leading to a so-called project scheduling problem with flexible resources (PSPFR). In this case, each resource has a set of skills or abilities that can perform.

The PSPFR arises in many practical situations. For instance, consulting companies often have to allocate multi-skilled teams to different projects. Software development companies also face this problem when people with different abilities are to be combined in different teams to contribute to the development of several products. The organization of evaluation teams is another example. In this case, multi-skilled teams are to be organized to visit departments, organizations or institutions and write an evaluation report.

In the simplest setting of the PSPFR, each activity requires several units of each skill. Additionally, each resource has several skills but can contribute with at most one skill-unit to each activity. Moreover, the allocation of each resource to some activity should last for the entire duration of the activity. The resources are assumed to be renewable, which means that after being used to perform some activity they can be assigned to some other activity. The goal is to decide for each activity, which resources should be allocated to it and when should they be scheduled so that the makespan is minimized, i.e, the project finishes as soon as possible. This is the problem addressed in this talk.

A mixed-integer linear programming formulation is proposed for the problem as well as several sets of additional inequalities. Due to the fact that some of these inequalities require a valid upper bound to the problem, a heuristic procedure is also proposed. Computational tests performed using randomly generated data are reported, which indicates that for instances of realistic size, the proposed model enlarged with the additional inequalities can be solved efficiently using a commercial package

El ponent: Francisco Saldanha da Gama is Assistant Professor of Operations Research at the Department of Statistics and Operations Research of the Faculty of Science, University of Lisbon. He received his Ph.D. in 2002 from the same University. He has taught undergraduate and postgraduate courses in various universities, including the Technical University of Lisbon and the University of Lisbon. He has regularly published papers in scientific international journals mostly in the areas of location theory, supply chain management, logistics and combinatorial optimization. He is a member of several national and international scientific organizations such as the EURO Working Group on Location Analysis of which he is one the current coordinators. His current research interests include stochastic combinatorial optimization problems, location theory and scheduling.

Clicar aquí per accedir a la web de Francisco Saldanha da Gama.



CONVIDAT: Javier Contreras, Departamento de Ingeniería Eléctrica, Universidad de Castilla- La Mancha..

IDIOMA: Espanyol

LLOC: Aula: C4002-Aula amfiteatre, Campus Nord, UPC (veure mapa) <--- Compte : lloc no habitual

DATA: Divendres, 14 de gener de 2011, Hora: 12:30

RESUM: This paper proposes a bilevel programming approach to determine the optimal contract price of dispatchable distributed generation (DG) units in distribution systems. Two different agents are considered in the model, namely, the distribution company (DisCo) and the owner of the DG. The former seeks the minimization of the payments incurred in attending the forecasted demand, while the latter seeks the maximization of his profit. To meet the expected demand, the DisCo has the option to purchase energy from any DG unit within its network and directly from the wholesale electricity market. A traditional distribution utility model with no competition among DG units is considered. The proposed model positions the DG owner in the outer optimization level and the DisCo in the inner one. This last optimization problem is substituted by its Karush-Kuhn-Tucker optimality conditions, turning the bilevel programming problem into an equivalent single-level nonlinear programming problem which is solved using commercially available software. Tests are performed in a modified IEEE 34-bus distribution network.

El ponent: clicar aquí per accedir a la web de Javier Contreras.




CONVIDAT: Yves Tillé, University of Neuchatel.

IDIOMA: Castellà.

LLOC: Edifici C6, Aula: C6003, Campus Nord, UPC (veure mapa)

DATA: Divendres, 3 de Desembre de 2010, Hora: 13:00

RESUM: In surveys sampling, two paradigms can be used to conduct statistical inference. In the design based-inference, the sample is selected randomly by means of a sampling design. One can then construct unbiased estimators with respect to the sampling design. Next, these estimators can eventually be improved by a calibration method on auxiliary information. In model-based inference, the sample is used to estimate a parameter of a model. Next, the model can be used to forecast the unobserved values of the population. With the observed values and the predicted unobserved values, any parameter of the population can then be estimated. In both cases, the main questions are: How to select the sample and how to estimate the parameters? We will discuss the interests of these approaches and show that under some models, the optimal strategies can be the same for the design-based and model-based approaches.

Web del ponent: clicar aquí per accedir a la web de Yves Tillé.



CONVIDAT: Ester Ruiz, Departamento de Estadística, Universidad Carlos III de Madrid.

IDIOMA: Castellà.

LLOC: Edifici C6, Aula: C6003, Campus Nord, UPC (veure mapa)

DATA: Divendres, 26 de Novembre de 2010, Hora: 12:30

RESUM: One advantage of state space models is that they deliver estimates of the unobserved components and predictions of future values of the observed series and their corresponding Prediction Mean Squared Errors (PMSE). However, assuming that the model specification is known, these PMSEs are obtained by running the Kalman filter with the true parameters substituted by consistent estimates. Consequently, they do not incorporate the uncertainty due to parameter estimation. We review new bootstrap procedures to estimate the PMSEs of the unobserved states and to construct prediction intervals of future observations that incorporate parameter uncertainty and do not rely on particular assumpions of the error distribution. The new bootstrap PMSEs of the unobserved states have smaller biases than those obtained with alternative procedures. Furthermore, the prediction intervals have better coverage properties. However, we show that the bootstrap PMSEs are not robust against miss-specification of the model. The results are illustrated by obtaining prediction intervals of the quarterly mortages changes and of the unobserved output gap in USA.

Web del ponent: clicar aquí per accedir a la web de Ester Ruiz.



CONVIDAT: Mercedes Landete Ruiz, CIO, Universidad Miguel Hernández, Elche.

IDIOMA: Castellà

LLOC: Edifici C6, Aula: C6003, Campus Nord, UPC (veure mapa)

DATA: Divendres, 19 de Novembre de 2010, Hora: 12:30

RESUM: El problema de localización de plantas sin capacidades es un clásico en la literatura de los problemas de localización. El objetivo es dar servicio a todos los clientes minimizando el coste total que resulta de sumar el coste de transporte desde las plantas hasta los clientes y el coste de apertura de las plantas. El problema se plantea cuando no existe ninguna planta en el sistema y se tiene que decidir tanto cuales de las ubicaciones candidatas van efectivamente a pagar por ser plantas efectivas como qué clientes se van a servir desde cada planta abierta. Aunque desde el punto de vista teórico el problema es complejo, tanto el objetivo como las condiciones son sencillos. Una modificación del planteamiento que ha suscitado mucho interés recientemente es la posibilidad de que las plantas que sirven a los clientes fallen bajo cierta probabilidad. Si algunas o todas las plantas pueden fallar durante el período de servicio hay que considerar soluciones que indiquen cómo actuar cuando esto ocurre. Para abordar el problema de localización de plantas sin capacidades cuando existe una probabilidad de fallo, se pueden plantear diversos modelos. Nosotros presentaremos varios modelos lineales tipo mediana. Revisaremos algunos modelos en la literatura y añadiremos otros que permiten procedimientos alternativos de resolución.

Web del ponent: clicar aquí per accedir a la web de Mercedes Landete.



CONVIDAT: Annie Morin, IRISA, Université de Rennes 1, Francia.

IDIOMA: Anglès.

LLOC: Edifici C6, Aula: C6003, Campus Nord, UPC (veure mapa)

DATA: Divendres, 5 de Novembre de 2010, Hora: 12:30

RESUM: Our investigation aims at exploring interactively the results of the Factorial Correspondence Analysis (FCA) (Benzécri, 1973) applied on images in order to be able to extract knowledge and to better interpret the results obtained from a FCA. We propose an interactive graphical tool, CAViz, which allows us to view and to extract knowledge from the FCA results on images. Originally the FCA is for the analysis of contingency tables. An application is Textual Data Analysis (TDA). In Textual Data Analysis, the contingency table crosses words and documents. For adapting FCA on images, the first step is to define the "visual" words in images (similar words in the texts). These words are constructed from local descriptors (SIFT, Scale Invariant Feature Transform) in images. CAViz projects clouds of points in factorial plans and allows viewing and extracting interesting information such as: characterizing words, important factors using FCA relevant indicators (representation quality and contribution to the inertia). An application to the Caltech4 base (Sivic and al., 2005) demonstrates the interest of CAViz for the analysis of the FCA results.

Web del ponent: clicar aquí para acceder a la web de Annie Morin.



CONVIDAT: Virgilio Gómez Rubio, Departamento de Matemática, Escuela de Ingenieros Industriales de Albacete, Universidad de Castilla-La Mancha.

IDIOMA: Castellà.

LLOC: Edifici C6, Aula: C6003, Campus Nord, UPC (veure mapa)

DATA: Divendres, 22 d'Octubre de 2010, Hora: 12:30

RESUM: La quitridiomicosis es una enfermedad infecciosa fatal en los anfibios que en los últimos años se ha extendido de manera alarmante a numerosas poblaciones de anfibios en todo el mundo. En mi charla presentaré el uso que hemos hecho de modelos Bayesianos para el estudio de la distribución de casos de esta enfermedad en la Península Ibérica. Los datos provienen de distintos muestreos en los que se contabilizó el número de anfibios muertos y el total de anfibios encontrados en diferentes puntos de muestreo. Hemos utilizado modelos de "Binomial cero inflada" para modelizar, por un lado, la prevalencia de la infección y, por otro, el hecho de que en muchos puntos de muestreo el número de casos encontrados fue de cero. Además, hemos evaluado el posible efecto de variables medioambientales en la probabilidad de infección y el estado de desarrollo de los anfibios. Los resultados más relevantes de este trabajo han sido publicados en la revista Ecology Letters y el artículo completo puede descargarse aqui.

Web del ponent: clicar aquí para acceder a la web de Virgilio Gómez.



CONVIDAT: Andrew Harvey, Faculty of Economics, University of Cambridge.

IDIOMA: Anglès.

LLOC: Edifici C6, Aula: C6003, Campus Nord, UPC (veure mapa)

DATA: Divendres, 15 d'Octubre de 2010, Hora: 12:30

RESUM: The asymptotic distribution of maximum likelihood estimators is derived for a class of exponential generalized autoregressive conditional heteroskedasticity (EGARCH) models. The result carries over to models for duration and realised volatility that use an exponential link function. A key feature of the model formulation is that the dynamics are driven by the score.

Web del ponent: clica aquí per accedir a la web de Andrew Harvey.



CONVIDAT: Mauricio G. C. Resende, AT&T Labs Research, Shannon Laboratory, New Jersey .

IDIOMA: Anglès.

LLOC: Edifici C5, Aula: C5016, Campus Nord, UPC (veure mapa)

DATA: Divendres, 10 de setembre de 2010, Hora: 12:30

RESUM: We consider a constrained two-dimensional packing problem where a fixed set of small weighted rectangles has to be placed without overlap into a larger stock rectangle so as to maximize the sum of the weights of the packed rectangles. The algorithm we propose hybridizes a new placement procedure with a genetic algorithm based on random keys. We also propose a new fitness function to drive the optimization. The approach is tested on a set of instances taken from the literature and compared with other approaches that had produced best known solutions for this problem.

This is joint work with José Gonçalves (U. of Porto)

Web del ponent: clica aquí per accedir a la web de Mauricio G. C. Resende.



CONVIDAT: Fabrizio Ruggeri, CNR IMATI, Milano, Italy.

IDIOMA: Anglès.

LLOC: FME, Edifici U, Aula: 004, Campus Sud, UPC, Pau Gargallo, 5, 02028 Barcelona (veure mapa)

DATA: Dimarts, 22 de juny de 2010, Hora: 12:30

RESUM: In a context of software reliability, two models are presented to describe the case of reliability decay, due to the introduction of new bugs. Since the introduction of bugs is an unobservable process, latent variables are considered to take this process in account. The two models are based, respectively, on a hidden Markov model and a self-exciting point process with latent variables. Refik Soyer (George Washington University) and Antonio Pievatolo (CNR IMATI) are the co-authors of the work.

Web del ponent: clica aquí per accedir a la web de Fabrizio Ruggeri.


CONVIDAT:Wenceslao González Manteiga, Departamento de Estatística e Investigación Operativa, Universidade de Santiago de Compostela.

IDIOMA: Castellà.

LLOC: FME, Edifici U, Aula: 003, Campus Sud, UPC, Pau Gargallo, 5, 02028 Barcelona (veure mapa)

DATA: Divendres, 18 de juny de 2010, Hora: 12:30

RESUM: The receiver operating characteristic curve (ROC curve) is a tool of extensive use to analyse the discrimination capability of a diagnostic variable in medical studies. In certain situations, the presence of a covariate related to the diagnostic variable can increase the discriminating power of the ROC curve. In this article we model the effect of the covariate over the diagnostic variable by means of nonparametric location-scale regression models. We propose a new nonparametric estimator of the conditional ROC curve and study its asymptotic properties. We also present some simulations and an illustration to a data set concerning diagnosis of diabetes. (Joint work with Juan Carlos Pardo-Fernández and Ingrid van Keilegom).

Web del ponent: clica aquí per accedir a la web de Wenceslao González Manteiga.


CONVIDAT: Josep Antoni Martín-Fernández, Departament d'Informàtica i Matemàtica Aplicada, Universitat de Girona.

IDIOMA: Català

LLOC: Edifici C5, Aula: C5016, Campus Nord, UPC (veure mapa)

DATA: Divendres, 21 de maig de 2010, Hora: 12:30

RESUM: Històricament les dades composicionals han estat definides com a realitzacions d'un vector aleatori de suma constant. Exemples típics són, entre d'altres, els vectors de proporcions, percentatges, i ppm. Els darrers avenços, especialment durant la darrera dècada, han posat de relleu que aquesta definició ha quedat obsoleta. La perspectiva més actual posa èmfasis en què, en realitat, és el propi investigador qui davant d'un problema decideix assumir que les seves dades són de tipus composicional. Aquesta decisió és un "punt-de-no-retorn" atès que qualsevol anàlisi estadística que es pretengui dur a terme ha de tenir en compte aquesta naturalesa específica. L'acrònim CODA (COmpositional Data Analysis) és la referència internacional per a designar la metodologia i el conjunt de tècniques estadístiques dissenyades en coherència amb la tipologia de les dades composicionals. CODA inclou des de conceptes tan bàsics, com són el centre composicional i la variabilitat total d'un conjunt de dades; a eines fonamentals, com la tècnica PCA-biplot i el model de distribució normal multivariant; o a tècniques més sofisticades per analitzar la descomposició de la variabilitat, com és el CODA-dendrograma.

Web del ponent: clica aquí per accedir a la web de Josep Antoni Martín


CONVIDAT: Robert Garfinkel, Operations and Information Management Department, University of Connecticut, USA.

IDIOMA: Anglès

LLOC: Edifici C5, Aula: C5016, Campus Nord, UPC (veure mapa)

DATA: Divendres, 7 de maig de 2010, Hora: 12:00

RESUM: Digital microfluidic biochips are rectangular arrays of electrodes, or cells, that enable precise manipulation of nanoliter-sized droplets of biological fluids and chemical reagents. Due to the safety-critical nature of their applications, biochips must be tested frequently, both off-line (e.g., post-manufacturing) and concurrent with assay execution. Under both scenarios, testing is accomplished by routing one or more test droplets across the chip and recording their arrival at the destination. In this paper we formalize the DMFB testing problem under the common objective of completion time minimization, including previously ignored constraints of droplet non-interference. Our contributions include a proof that the general version of the problem is NP- hard, tight lower-bounds for both off-line and concurrent testing, optimal and approximation algorithms for off-line testing of commonly used rectangular shaped biochips, as well as a concurrent testing heuristic producing solutions within 23-34% of the lower-bound in experiments conducted on datasets simulating varying percentages of biochip cells occupied by concurrently running assays.



CONVIDAT: Rumana Omar, Statistical Science, University College London.

IDIOMA: Anglès

LLOC: FME, Edifici U, Aula: 100 (1er pis), Campus Sud, UPC, Pau Gargallo, 5, 02028 Barcelona (veure mapa)

DATA: Dimecres, 5 de maig de 2010, Hora: 12:30

RESUM: Random effects (RE) models and marginal models based on generalised estimating equations (GEE) are frequently used to analyse longitudinal repeated measurements health studies where subject dropout is common. The results from these models can be biased depending on missing data. Missing completely at random (MCAR) implies that the missing observations are a random sample of all observations. Missing at random (MAR), means that the probability of being missing may depend on what has already been observed in the data. The most difficult category to handle is missing not at random (MNAR), where the probability of the observation being missing depends on its value. RE models require MAR assumption. Because marginal models are not based on likelihoods, they require data to be MCAR. When the data are Gaussian, the GEEs reduce to score equations and provided the correct correlation structure is applied the two types of models are equivalent and marginal models are then robust to MAR. However, equivalent marginal and RE models for Gaussian data may not necessarily produce identical parameter estimates due to missing data as GEEs may not reduce to score equations then, even in presence of MCAR. For binary data the marginal models require MCAR assumption. By definition neither RE or marginal models are robust to MNAR.

In practice, the extent to which missing observations cause bias to the parameter estimates of these models and affect their clinical and statistical significance is not clear. Limited simulation studies have been conducted. However, it is not clear what proportion of missingness leads to substantial bias or how sensitivity to missing data compares between cluster-level and cluster-varying covariates, and how the RE and marginal models compare in these scenarios. It is not known to what extent the marginal model is robust to misspecification of the working correlation matrix in presence of missing data and whether the strength of the intracluster correlation coefficient affects the bias in parameter estimates caused by MNAR. Furthermore these studies did not explore random parameters or non-Gaussian data. If standards are to improve in the design and analysis of incomplete longitudinal studies, a greater understanding of the extent to which missing data affects the parameter estimates is vital. The aim here is to explore the effects of dropout on parameter estimates of RE and marginal models for repeated measurements data for both Gaussian and binary outcomes by carrying out a thorough investigation using simulation studies. From the findings recommendations are made for the analysis of incomplete longitudinal studies.

Web del ponent: clica aquí per accedir a la web de Rumana Omar.


CONVIDAT: Peter McCullagh, Department of Statistics, University of Chicago.

IDIOMA: Anglès

LLOC: Edifici C5, Aula: C5016, Campus Nord, UPC (veure mapa)

DATA: Divendres, 19 de març de 2010, Hora: 12:30

RESUM: Although Bayes's theorem demands a prior that is a probability distribution on the parameter space, the formal application of Bayes's theorem sometimes generates sensible procedures from improper priors, Pitman's estimator being a good example. However, the formal application of improper priors may also lead to Bayes procedures that are paradoxical or otherwise unsatisfactory, prompting some authors to insist that all priors be proper. This paper begins with the observation that an improper measure on $\Theta$ satisfying a certain countability condition is in fact a probability distribution on the power set. We show how to extend a model in such a way that the parameter space is also extended to the power set. The conditions for Bayes's theorem are then satisfied under an additional finiteness condition, which is needed for the existence of a sampling region. Lack of interference is a key property of the extension, ensuring that the posterior distribution in the extended space is compatible with the original parameter space. Provided that the key finiteness condition is satisfied, this probabilistic analysis of the extended model may be interpreted as a vindication of improper Bayes procedures generated from the original model. The implications for marginalization paradoxes will be discussed.



CONVIDAT: Lluis M. Plà-Aragonès, Departament de Matemàtiques, Universitat de Lleida.

IDIOMA: Català

LLOC: Edifici C5, Aula: C5016, Campus Nord, UPC (veure mapa)

DATA: Divendres, 12 de març de 2010, Hora: 12:00

RESUM: Pig production is very important in Spain, the second larger producer within EU-25. The sector is becoming very specialized and modern OR methodologies applied. In this seminar the formulation and resolution of a two-stage stochastic linear programming model with recourse for sow farms producing piglets is presented. The proposed model considers a medium-term planning horizon and specifically allows optimal replacement and schedule of purchases for the first stage. This model takes into account sow herd dynamics, housing facilities, reproduction management, herd size and uncertain parameters like litter size, mortality and fertility rates. These last parameters are explicitly incorporated via a finite set of scenarios. The model is solved by using the algebraic modelling software OPL Studio from ILOG, in combination with the solver CPLEX. Results obtained with previous deterministic models assessing the suitability of the stochastic approach are also discussed. Furthermore, other refinements for practical application and future developments are presented as well.



CONVIDAT: Gilbert Laporte, Canada Research Chair in Distribution Management, HEC Montreal.

IDIOMA: Anglès

LLOC: Edifici C5, Aula: Aula: C5016, Campus Nord, UPC (veure mapa)

DATA: Dijous, 25 de febrer de 2010, Hora: 12:00

RESUM: In the Dynamic Multi-Period Vehicle Routing Problem, distribution orders originating at customer locations arrive dynamically at a distribution centre. Deliveries must be performed within a given time window (which may be several days). Orders are assigned to vehicles over a rolling horizon. A primary objective is to minimize the total distance traveled by the vehicles. Another objective is to balance the vehicles? workload over time. We propose a multi-stage mixed-integer programming formulation for the problem. We also develop and compare three heuristics. Tests performed on real data provided by a distributing company based in Sweden indicate that the proposed approach yields results that outperform those obtained by the company.



CONVIDAT: Carlo N. Lauro, Department of Mathematics and Statistics, University of Naples Federico II (Italy)


LLOC: Edifici C5, Aula: C5016, Campus Nord, UPC (veure mapa)

DATA: Dimarts, 12 de gener de 2010, Hora: 18:30

RESUM: Aim of this seminar is to underline the main contributions in the context of Factorial Conjoint Analysis. The integration of Conjoint Analysis with the exploratory tools of Multidimensional Data Analysis is the basis of different research strategies, proposed by the author, combining the common estimation method with its geometrical representation. Here we present a systematic and unitary review of some of these methodologies by taking into account their contribution to several open ended problems.



CONVIDAT: Antonio Ciampi, Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Canada.

IDIOMA: Anglès

LLOC: Edifici C5, Aula: C5016, Campus Nord, UPC (veure mapa)

DATA: Dimecres, 13 de gener de 2010, Hora: 12:30

RESUM: We discuss an approach to 2-way classification based on the chi-2 distance and correspondence analysis. We present in particular two classification algorithms: the first one operates a dimension reduction before applying clustering techniques to rows and columns. The second one, successively partitions the data matrix to extract several classification schemes rather than one. Applications to gene expression and web data are presented.